GROWTH FACTOR RECEPTOR-BINDING INSULIN RECEPTOR



Field of the Invention 

The present invention relates to an isolated
isoform of human growth factor receptor-binding
insulin-receptor protein (GrbIR-1) gene; to essentially pure
human GrbIR-1 protein; and to compositions and methods
of producing and using human GrbIR-1 sequences and
proteins.

Background of the Invention 

A number of polypeptide growth factors and hormones
mediate their cellular effects through a signal
transduction pathway. Transduction of signals from the
cell surface receptors for these ligands to
intracellular effectors frequently involves
phosphorylation or dephosphorylation of specific protein
substrates by regulatory protein tyrosine kinases (PTK)
and phosphatases. Tyrosine phosphorylation is a major
mediator of signal transduction in multicellular
organisms. Receptor-bound, membrane-bound and
intracellular PTKs regulate cell proliferation, cell
differentiation and signalling processes in immune
system cells.

Aberrant PTK activity has been implicated or is
suspected in a number of pathologies such as diabetes,
atherosclerosis, psoriasis, septic shock, bone loss,
anemia, many cancers and other proliferative diseases.
Accordingly, tyrosine kinases and the signal
transduction pathways of which they are part are
potential targets for drug design. For a review, see
Levitzki et al. in Science 267, 1782-1788 (1995).

Many of the proteins comprising signal transduction
pathways are present at low levels and often have
opposing activities. The properties of these signalling
molecules allow the cell to control transduction by
means of the subcellular location and juxtaposition of
effectors, as well as by balancing activation with
repression such that a small change in one pathway can
achieve a switching effect.

The formation of transducing complexes by
juxtaposition of the signalling molecules through
protein-protein interactions is mediated by specific
docking domain sequence motifs. Src homology 2 (SH2)
domains, which are conserved non-catalytic sequences of 
approximately 100 amino acids found in a variety of 
signalling molecules such as non-receptor PTKs and 
kinase target effector molecules and in oncogenic 
proteins, play a critical role. The SH2 domains are 
highly specific for short phosphotyrosine-containing 
peptide sequences found in autophosphorylated PTK 
receptors or intracellular tyrosine kinases. Src 
homology 3 (SH3) domains, conserved sequences of 
approximately 50 amino acids that mediate protein- 
protein interactions through sequence-specific binding 
to proline-rich motifs in target proteins, are also 
critically involved in signal transduction. Pleckstrin 
homology (PH) domains are also involved in signal 
transduction and control membrane association of 
signaling molecules. See G. Shaw, Bioessays 18, 35-46
(1996). At least 90 proteins having conserved SH2, SH3
or PH domains, and, in many cases, distinct catalytic 
domains, are now known. 

One approach towards the pharmacological regulation 
of signal transduction pathways is to design ligands 
which selectively bind to a chosen PH domain and thus 
affect the interaction of membrane-associated inositol
1,4,5-trisphosphate with its PH domain-containing target
molecule, thereby modulating signal transduction. Any
selective modulators would provide a useful lead for 
drug development. 

Growth factor receptor binding protein-Insulin
Receptor (Grb-IR) is a cytoplasmic signalling molecule
containing an SH2 domain and a partial PH domain with a
wide tissue and cell distribution. The molecule was
first described by F. Liu and R. A. Roth in Proc. Natl.
Acad. Sci. USA 92, 10287-10291 (1995). Interaction of
Grb-IR with growth factor receptors such as the insulin
receptor (IR) is mediated by the SH2 domain, can be
dependent upon receptor tyrosine autophosphorylation and
involves a direct interaction between Grb-IR and the
phosphorylated receptors.

Further, binding of Grb-IR to the insulin receptor
has been shown to inhibit subsequent signalling events
such as insulin-dependent tyrosine phosphorylation of a
60k GAP-associated protein, IRS-1 and insulin induced
association of phosphatidyl inositol-3 kinase with IRS-1
(Liu and Roth, supra). Thus, Grb-IR inhibits insulin
signalling through the IR. Membrane association of
signalling molecules is important for bringing them in
close proximity to other effectors. An example is ras
which is farnesylated at the C-terminus and thereby
located to the plasma membrane. The importance of such
localization is shown by the inhibitory effect of
farnesyl transferase inhibitors on ras-mediated signal
transduction. See Tamanoi, F., Trends in Biochemical
Sciences 18, 349-353 (1993).

In the case of Grb-IR, a PH domain could serve a
similar purpose, since PH domains are known to
facilitate membrane association of proteins through
binding to inositol 1,4,5-trisphosphate residues in cell
membranes. See H. F. Paterson et al., Biochem. J. 312,
661-666 (1995). Phospholipase C delta 1 requires a
pleckstrin homology domain for interaction with the
plasma membrane. See D. S. Wang & G. Shaw, Biochem.
Biophys. Res. Commun. 217, 608-615 (1995). The
association of the C-terminal region of beta I sigma II
spectrin to brain membranes is mediated by a PH domain,
does not require membrane proteins, and coincides with an
inositol-1,4,5-trisphosphate binding site. See P. Garcia
et al., Biochemistry 34, 16228-16234 (1995). The
pleckstrin homology domain of phospholipase C-delta 1
binds with high affinity to phosphatidylinositol
4,5-bisphosphate in bilayer membranes. However, the known
Grb-IR sequence lacks an intact PH domain.

The involvement of Grb-IR in the signal
transduction of the insulin receptor pathway
necessitates the identification of other human Grb-IR
homologs and isoforms, preferably those containing
intact PH domains, and their cDNAs. A need also exists
for compounds which modulate the activity of Grb-IR
homologs and isoforms, for methods to identify such
modulators and for reagents useful in such methods.

Summary of the Invention

Accordingly, one aspect of the present invention is
an isolated polynucleotide selected from the group
consisting of:

(a) a polynucleotide encoding human GrbIR-1 having
the nucleotide sequence as set forth in SEQ ID NO:1 from
nucleotide 289 to 1897;

(b) a polynucleotide capable of hybridizing to the
complement of a polynucleotide according to (a) under
moderately stringent hybridization conditions and which
encodes a functional human GrbIR-1; and

(c) a degenerate polynucleotide according to (a)
or (b).

Another aspect of the invention is a functional
polypeptide encoded by the polynucleotides of the
invention.

Another aspect of the invention is a method for
preparing essentially pure human GrbIR-1 protein
comprising culturing a recombinant host cell comprising
a vector comprising a polynucleotide of the invention
under conditions promoting expression of the protein and
recovery thereof.

Another aspect of the invention is an antisense
oligonucleotide comprising a sequence which is capable
of binding to the polynucleotide of the invention.

Another aspect of the invention is a modulator of
the polypeptides of the invention. 

Another aspect of the invention is a method for 
assaying a medium for the presence of a substance that 
modulates GrbIR-1 activity by affecting the binding of
GrbIR-1 to cellular binding partners comprising the 
steps of: 

(a) providing a GrbIR-1 protein having the amino 
acid sequence of GrbIR-1 (SEQ ID NO: 2) or a functional 
derivative thereof and a cellular binding partner or 
synthetic analog thereof; 

(b) incubating with a test substance which is 
suspected of modulating GrbIR-1 activity under 
conditions which permit the formation of a GrbIR-1 
protein/cellular binding partner complex; 

(c) assaying for the presence of the complex, free 
GrbIR-1 protein or free cellular binding partner; and 

(d) comparing to a control to determine the effect 
of the substance. 
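
Step (d) of the method above can be illustrated numerically. The following is a minimal sketch, assuming the assay yields a single numeric readout proportional to the amount of complex formed; the function name, the background term and the sample readouts are hypothetical and are not part of the disclosed method.

```python
def percent_of_control(test_signal, control_signal, background=0.0):
    """Complex formation with the test substance, as a percentage of control.

    Values below 100 suggest inhibition of binding; values above 100
    suggest enhancement. All inputs are hypothetical assay readouts.
    """
    control_net = control_signal - background
    if control_net <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (test_signal - background) / control_net


# Hypothetical readouts: control 1200, test substance 450, background 200.
print(percent_of_control(450.0, 1200.0, background=200.0))  # 25.0
```

In this made-up example the test substance leaves 25% of the control level of complex, which would be read as inhibition of binding.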

Another aspect of the invention is a method for 
assaying for the presence of a substance that modulates 
GrbIR-1 activity by direct binding to GrbIR-1 protein 
comprising the steps of: 

(a) providing a labelled GrbIR-1 protein having 
the amino acid sequence of GrbIR-1 (SEQ ID NO: 2) or a 
functional derivative thereof;

(b) providing solid support-associated modulator 
candidates; 

(c) incubating a mixture of the labelled GrbIR-1 
protein with the support-associated modulator candidates 
under conditions which can permit the formation of a 
GrbIR-1 protein/modulator candidate complex;

(d) separating the solid support from free soluble
labelled GrbIR-1 protein;

(e) assaying for the presence of solid
support-associated labelled protein;

(f) isolating the solid support complexed with
labelled GrbIR-1 protein; and

(g) identifying the modulator candidate.

Another aspect of the invention is GrbIR-1 protein 
modulating compounds identified by the methods of the 
invention. 

Another aspect of the invention is a method for the
treatment of a patient having need to modulate GrbIR-1
activity comprising administering to the patient a 
therapeutically effective amount of the modulating 
compounds of the invention. 

Another aspect of the invention is a method of 
treating conditions which are related to insufficient 
GrbIR-1 protein function which comprises: 

(a) isolating cells from a patient deficient in 
GrbIR-1 protein function; 

(b) altering the cells by transfecting the 
polynucleotide of claim 1 into the cells wherein a 
GrbIR-1 protein is expressed; and 

(c) introducing the cells back to the patient to 
alleviate the condition. 

Another aspect of the invention is a method of 
treating conditions which are related to insufficient 
GrbIR-1 protein function which comprises administering 
the polynucleotide of claim 1 to a patient deficient in 
GrbIR-1 protein function wherein a GrbIR-1 protein is 
expressed and alleviates the condition. 

Another aspect of the invention is a transgenic 
non-human animal capable of expressing in any cell 
thereof the DNA encoding the polypeptides of the 
invention. 



Brief Description of the Drawings

Figure 1 is a multiple amino acid sequence
alignment of GrbIR-1, Grb-IR, murine Grb10 and human
Grb7.

Figure 2 is an amino acid sequence alignment of
human GrbIR-1 with human Grb-IR.

Detailed Description of the Invention 

As used herein, the term "GrbIR-1 gene" refers to
DNA molecules comprising a nucleotide sequence that
encodes an isoform of human growth factor receptor
binding insulin receptor. The GrbIR-1 gene sequence is
listed in SEQ ID NO:1. The coding region of the GrbIR-1
gene consists of nucleotides 289 to 1897 of SEQ ID NO:1.
The deduced 536 amino acid sequence of the GrbIR-1 gene
product GrbIR-1 is listed in SEQ ID NO:2.
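
Sequence coordinates of this kind are conventionally 1-based and inclusive. The following is a minimal sketch of how such coordinates map onto a zero-based string slice. It uses a short made-up sequence, since SEQ ID NO:1 is not reproduced here, and the function name is an assumption.

```python
def coding_region(seq, start, end):
    """Return the subsequence from start to end, 1-based and inclusive."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates out of range")
    # A 1-based inclusive range [start, end] is the slice [start - 1, end).
    return seq[start - 1:end]


toy = "GGCCATGAAATTTTAGCC"  # made-up sequence, not SEQ ID NO:1
print(coding_region(toy, 5, 16))  # ATGAAATTTTAG
```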

As used herein, the term "functional fragments"
when used to modify a specific gene or gene product
means a less than full length portion of the gene or
gene product which retains substantially all of the
biological function associated with the full length gene
or gene product to which it relates. To determine
whether a fragment of a particular gene or gene product
is a functional fragment, fragments are generated by
well-known nucleolytic or proteolytic techniques or by
the polymerase chain reaction and the fragments are tested
for the described biological function.

As used herein, an "antigen" refers to a molecule
containing one or more epitopes that will stimulate a
host's immune system to make a humoral and/or cellular
antigen-specific response. The term is also used herein
interchangeably with "immunogen."

As used herein, the term "epitope" refers to the
site on an antigen or hapten to which a specific
antibody molecule binds. The term is also used herein
interchangeably with "antigenic determinant" or
"antigenic determinant site."

As used herein, "monoclonal antibody" is understood
to include antibodies derived from one species (e.g.,
murine, rabbit, goat, rat, human, etc.) as well as
antibodies derived from two (or perhaps more) species
(e.g., chimeric and humanized antibodies).

As used herein, a coding sequence is "operably 
linked to" another coding sequence when RNA polymerase 
will transcribe the two coding sequences into a single 
mRNA, which is then translated into a single polypeptide 
having amino acids derived from both coding sequences. 
The coding sequences need not be contiguous to one 
another so long as the expressed sequence is ultimately 
processed to produce the desired protein. 

As used herein, "recombinant" polypeptides refer to 
polypeptides produced by recombinant DNA techniques; 
i.e., produced from cells transformed by an exogenous DNA 
construct encoding the desired polypeptide. "Synthetic" 
polypeptides are those prepared by chemical synthesis. 

As used herein, a "replicon" is any genetic element 
(e.g., plasmid, chromosome, virus) that functions as an 
autonomous unit of DNA replication in vivo; i.e., 
capable of replication under its own control. 

As used herein, a "vector" is a replicon, such as a 
plasmid, phage, or cosmid, to which another DNA segment 
may be attached so as to bring about the replication of 
the attached segment. 

As used herein, a "reference" gene refers to the 
wild type human GrbIR-1 gene sequence of the invention 
and is understood to include the various sequence 
polymorphisms that exist, wherein nucleotide 
substitutions in the gene sequence exist, but do not 
affect the essential function of the gene product. 

As used herein, a "mutant" gene refers to human GrbIR-1
sequences different from the reference gene wherein
nucleotide substitutions and/or deletions and/or
insertions result in perturbation of the essential
function of the gene product.

As used herein, a DNA "coding sequence of" or a
"nucleotide sequence encoding" a particular protein is
a DNA sequence which is transcribed and translated into
a polypeptide when placed under the control of
appropriate regulatory sequences.

As used herein, a "promoter sequence" is a DNA
regulatory region capable of binding RNA polymerase in a
cell and initiating transcription of a downstream (3'
direction) coding sequence. For purposes of defining
the present invention, the promoter sequence is bound at
its 3' terminus by a translation start codon (e.g., ATG)
of a coding sequence and extends upstream (5' direction)
to include the minimum number of bases or elements
necessary to initiate transcription at levels detectable
above background. Within the promoter sequence will be
found a transcription initiation site (conveniently
defined by mapping with nuclease S1), as well as protein
binding domains (consensus sequences) responsible for
the binding of RNA polymerase. Eukaryotic promoters
will often, but not always, contain "TATA" boxes and
"CAT" boxes. Prokaryotic promoters contain
Shine-Dalgarno sequences in addition to the -10 and -35
consensus sequences.

As used herein, DNA "control sequences" refers
collectively to promoter sequences, ribosome binding
sites, polyadenylation signals, transcription 
termination sequences, upstream regulatory domains, 
enhancers and the like, which collectively provide for 
the expression (i.e., the transcription and translation) 
of a coding sequence in a host cell. 

As used herein, a control sequence "directs the
expression" of a coding sequence in a cell when RNA
polymerase will bind the promoter sequence and
transcribe the coding sequence into mRNA, which is then
translated into the polypeptide encoded by the coding
sequence.

As used herein, a "host cell" is a cell which has
been transformed or transfected, or is capable of
transformation or transfection by an exogenous DNA
sequence.

As used herein, a cell has been "transformed" by 
exogenous DNA when such exogenous DNA has been 
introduced inside the cell membrane. Exogenous DNA may 
or may not be integrated (covalently linked) into 
chromosomal DNA making up the genome of the cell. In 
prokaryotes and yeasts, for example, the exogenous DNA 
may be maintained on an episomal element, such as a 
plasmid. With respect to eukaryotic cells, a stably 
transformed or transfected cell is one in which the 
exogenous DNA has become integrated into the chromosome 
so that it is inherited by daughter cells through 
chromosome replication. This stability is demonstrated 
by the ability of the eukaryotic cell to establish cell 
lines or clones comprised of a population of daughter 
cells containing the exogenous DNA. 

As used herein, "transfection" or "transfected"
refers to a process by which cells take up foreign DNA 
and integrate that foreign DNA into their chromosome. 
Transfection can be accomplished, for example, by 
various techniques in which cells take up DNA (e.g., 
calcium phosphate precipitation, electroporation, 
assimilation of liposomes, etc.) or by infection, in 
which viruses are used to transfer DNA into cells. 

As used herein, a "target cell" is a cell that is 
selectively transfected over other cell types (or cell
lines).

As used herein, a "clone" is a population of cells
derived from a single cell or common ancestor by
mitosis. A "cell line" is a clone of a primary cell
that is capable of stable growth in vitro for many
generations.

As used herein, a "heterologous" region of a DNA
construct is an identifiable segment of DNA within or
attached to another DNA molecule that is not found in
association with the other molecule in nature. Thus,
when the heterologous region encodes a gene, the gene
will usually be flanked by DNA that does not flank the
gene in the genome of the source animal. Another example
of a heterologous coding sequence is a construct where
the coding sequence itself is not found in nature (e.g.,
synthetic sequences having codons different from the
native gene). Allelic variation or naturally occurring
mutational events do not give rise to a heterologous
region of DNA, as used herein.

As used herein, a "modulator" of a polypeptide is a
substance which can affect the polypeptide function.

An aspect of the present invention is isolated
polynucleotides encoding a human GrbIR-1 protein
including substantially similar sequences and functional
fragments. Isolated polynucleotide sequences are
substantially similar if they are capable of hybridizing
under moderately stringent conditions to SEQ ID NO:1 or
they encode DNA sequences which are degenerate to SEQ ID
NO:1 or are degenerate to those sequences capable of
hybridizing under moderately stringent conditions to SEQ
ID NO:1.

Moderately stringent conditions is a term
understood by the skilled artisan and has been described
in, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd edition, Vol. 1, pp. 101-104,
Cold Spring Harbor Laboratory Press (1989). An
exemplary hybridization protocol using moderately
stringent conditions is as follows. Nitrocellulose
filters are prehybridized at 65°C in a solution
containing 6X SSPE, 5X Denhardt's solution (10 g Ficoll,
10 g BSA and 10 g polyvinylpyrrolidone per liter
solution), 0.05% SDS and 100 µg/ml tRNA. Hybridization
probes are labeled, preferably radiolabeled (e.g.,
using the Bios TAG-IT® kit). Hybridization is then
carried out for approximately 18 hours at 65°C. The
filters are then washed twice in a solution of 2X SSC
and 0.5% SDS at room temperature for 15 minutes.
Subsequently, the filters are washed at 58°C, air-dried
and exposed to X-ray film overnight at -70°C with an
intensifying screen.

Degenerate DNA sequences encode the same amino acid
sequence as SEQ ID NO:2 or the proteins encoded by that
sequence capable of hybridizing under moderately
stringent conditions to SEQ ID NO:1, but have
variation(s) in the nucleotide coding sequences because
of the degeneracy of the genetic code. For example, the
degenerate codons UUC and UUU both code for the amino
acid phenylalanine, whereas the four codons GGX all code
for glycine.
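
The degeneracy example above can be checked mechanically. The following is a minimal sketch, assuming a deliberately partial codon table that holds only the codons named in the text; the function names and the test sequences are hypothetical.

```python
# Partial RNA codon table: only the phenylalanine and glycine codons
# named in the text. A real table would cover all 64 codons.
CODON_TABLE = {
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
}


def translate(rna):
    """Translate an RNA string codon by codon using CODON_TABLE."""
    if len(rna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3)]


def are_degenerate(rna_a, rna_b):
    """True if two different nucleotide sequences encode the same residues."""
    return rna_a != rna_b and translate(rna_a) == translate(rna_b)


print(translate("UUUGGA"))                 # ['Phe', 'Gly']
print(are_degenerate("UUUGGA", "UUCGGC"))  # True
```

The two made-up sequences differ at the third position of each codon yet encode the same dipeptide, which is the sense in which one sequence is degenerate to the other.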

Alternatively, substantially similar sequences are 
defined as those sequences in which about 70%, 
preferably about 80% and most preferably about 90%, of
the nucleotides or amino acids match over a defined 
length of the molecule. As used herein, substantially 
similar refers to the sequences having similar identity 
to the sequences of the instant invention. Thus 
nucleotide sequences that are substantially the same can 
be identified by hybridization or by sequence 
comparison. Protein sequences that are substantially 
the same can be identified by techniques such as 
proteolytic digestion, gel electrophoresis and/or 
microsequencing. Excluded from the definition of 
substantially similar sequences is Grb-IR. 
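
The matching criterion above can be sketched as a simple percent identity calculation. This is a minimal illustration, assuming two sequences that are already aligned and of equal length. The function names and the toy sequences are hypothetical, the thresholds are treated as hard cut-offs even though the text says "about", and the exclusion of Grb-IR is not modeled.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching positions in two pre-aligned sequences."""
    if len(seq_a) != len(seq_b) or len(seq_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def similarity_tier(identity):
    """Map a percent identity onto the thresholds named in the text."""
    if identity >= 90.0:
        return "about 90% (most preferred)"
    if identity >= 80.0:
        return "about 80% (preferred)"
    if identity >= 70.0:
        return "about 70%"
    return "below 70%"


ref = "ATGGCCAAGT"  # made-up 10-nucleotide sequences
qry = "ATGGCTAAGT"
print(percent_identity(ref, qry))                   # 90.0
print(similarity_tier(percent_identity(ref, qry)))  # about 90% (most preferred)
```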

Embodiments of the isolated polynucleotides of the
invention include DNA, genomic DNA and RNA, preferably
of human origin. A method for isolating a nucleic acid
molecule encoding a GrbIR-1 protein is to probe a
genomic or cDNA library with a natural or artificially
designed probe using art-recognized procedures. See,
e.g., "Current Protocols in Molecular Biology", Ausubel
et al. (eds.) Greene Publishing Association and John
Wiley Interscience, New York, 1989, 1992. The ordinarily
skilled artisan will appreciate that SEQ ID NO:1 or
fragments thereof comprising at least 15 contiguous
nucleotides are particularly useful probes. It is also
appreciated that such probes can be and are preferably
labeled with an analytically detectable reagent to
facilitate identification of the probe. Useful reagents
include, but are not limited to, radioisotopes,
fluorescent dyes or enzymes capable of catalyzing the
formation of a detectable product. The probes would
enable the ordinarily skilled artisan to isolate
complementary copies of genomic DNA, cDNA or RNA
polynucleotides encoding GrbIR-1 proteins from human,
mammalian or other animal sources or to screen such
sources for related sequences, e.g., additional members
of the family, type and/or subtype, including
transcriptional regulatory and control elements as well
as other stability, processing, translation and tissue
specificity-determining regions from 5' and/or 3'
regions relative to the coding sequences disclosed
herein, all without undue experimentation.
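
Fragments of at least 15 contiguous nucleotides can be enumerated with a sliding window. The following is a minimal sketch on a made-up 21-nucleotide sequence, not on SEQ ID NO:1; the function name and default window length are assumptions chosen to match the minimum stated above.

```python
def candidate_probes(seq, length=15):
    """Yield (1-based start, fragment) for each contiguous window of seq."""
    for i in range(len(seq) - length + 1):
        yield i + 1, seq[i:i + length]


toy = "ATGGCGTACCTGAAGCTTGAC"  # made-up 21-nucleotide sequence
probes = list(candidate_probes(toy))
print(len(probes))  # 7
print(probes[0])    # (1, 'ATGGCGTACCTGAAG')
```

A sequence of n nucleotides yields n - 14 such 15-mers; longer probes are obtained by passing a larger `length`.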

Another aspect of the invention is functional
polypeptides encoded by the polynucleotides of the
invention. An embodiment of a functional polypeptide of
the invention is the human GrbIR-1 protein having the
amino acid sequence set forth in SEQ ID NO:2.

Another aspect of the invention is a method for
preparing essentially pure human GrbIR-1 protein. Yet
another aspect is the human GrbIR-1 protein produced by
the preparation method of the invention. This protein
has the amino acid sequence listed in SEQ ID NO:2 and
includes variants with a substantially similar amino acid
sequence that have the same function. The proteins of
this invention are preferably made by recombinant genetic
engineering techniques by culturing a recombinant host
cell containing a vector encoding the polynucleotides of
the invention under conditions promoting the expression
of the protein and recovery thereof.

The isolated polynucleotides, particularly the DNAs,
can be introduced into expression vectors by operatively
linking the DNA to the necessary expression control
regions, e.g., regulatory regions, required for gene
expression. The vectors can be introduced into an
appropriate host cell such as a prokaryotic, e.g.,
bacterial, or eukaryotic, e.g., yeast or mammalian cell
by methods well known in the art. See Ausubel et al.,
supra. The coding sequences for the desired proteins,
having been prepared or isolated, can be cloned into any
suitable vector or replicon. Numerous cloning vectors
are known to those of skill in the art and the selection
of an appropriate cloning vector is a matter of choice.
Examples of recombinant DNA vectors for cloning and host
cells which they can transform include, but are not
limited to, the bacteriophage λ (E. coli), pBR322
(E. coli), pACYC177 (E. coli), pGEX4T-3 (E. coli), pKT230
(gram-negative bacteria), pGV1106 (gram-negative
bacteria), pLAFR1 (gram-negative bacteria), pME290
(non-E. coli gram-negative bacteria), pHV14 (E. coli and
Bacillus subtilis), pBD9 (Bacillus), pIJ61
(Streptomyces), pUC6 (Streptomyces), YIp5
(Saccharomyces), a baculovirus insect cell system, a
Drosophila insect system, YCp19 (Saccharomyces) and
pSV2neo (mammalian cells). See generally, "DNA Cloning":
Vols. I & II, Glover et al. ed., IRL Press Oxford (1985)
(1987); and T. Maniatis et al., "Molecular Cloning", Cold
Spring Harbor Laboratory (1982).

The gene can be placed under the control of control
elements such as a promoter, ribosome binding site (for
bacterial expression) and, optionally, an operator, so
that the DNA sequence encoding the desired protein is
transcribed into RNA in the host cell transformed by a
vector containing the expression construct. The coding
sequence may or may not contain a signal peptide or
leader sequence. The proteins of the present invention
can be expressed using, for example, the E. coli tac
promoter or the protein A gene (spa) promoter and signal
sequence. Leader sequences can be removed by the
bacterial host in post-translational processing. See,
e.g., U.S. Patent Nos. 4,431,739; 4,425,437 and
4,338,397.

In addition to control sequences, it may be 
desirable to add regulatory sequences which allow for 
regulation of the expression of the protein sequences 
relative to the growth of the host cell. Regulatory 
sequences are known to those of skill in the art.
Exemplary are those which cause the expression of a gene
to be turned on or off in response to a chemical or
physical stimulus, including the presence of a 
regulatory compound or to various temperature or 
metabolic conditions. Other types of regulatory 
elements may also be present in the vector, for example, 
enhancer sequences. 

An expression vector is constructed so that the
particular coding sequence is located in the vector with
the appropriate regulatory sequences, the positioning
and orientation of the coding sequence with respect to
the control sequences being such that the coding
sequence is transcribed under the "control" of the
control sequences, i.e., RNA polymerase which binds to
the DNA molecule at the control sequences transcribes
the coding sequence. Modification of the sequences
encoding the particular antigen of interest may be
desirable to achieve this end. For example, in some
cases it may be necessary to modify the sequence so that
it may be attached to the control sequences with the
appropriate orientation; i.e., to maintain the reading
frame. The control sequences and other regulatory
sequences may be ligated to the coding sequence prior to
insertion into a vector, such as the cloning vectors
described above. Alternatively, the coding sequence can
be cloned directly into an expression vector which
already contains the control sequences and an
appropriate restriction site.

WO98/01475

PCT/US96/U452

In some cases, it may be desirable to produce
mutants or analogues of human GrbIR-1 protein. Mutants
or analogues may be prepared by the deletion of a
portion of the sequence encoding the protein, by
insertion of a sequence, and/or by substitution of one
or more nucleotides within the sequence. Techniques for
modifying nucleotide sequences, such as site-directed
mutagenesis, are well known to those skilled in the art.
See, e.g., T. Maniatis et al., supra; "DNA Cloning,"
Vols. I and II, supra; and "Nucleic Acid Hybridization".



Depending on the expression system and host
selected, the proteins of the present invention are
produced by growing host cells transformed by an
expression vector described above under conditions
whereby the protein of interest is expressed. Preferred
mammalian cells include human embryonic kidney cells
(293), monkey kidney fibroblast (COS) cells,
Chinese hamster ovary (CHO) cells, Drosophila or murine
L-cells. If the expression system secretes the protein
into growth media, the protein can be purified directly
from the media. If the protein is not secreted, it is
isolated from cell lysates or recovered from the cell
membrane fraction. The selection of the appropriate
growth conditions and recovery methods are within the
skill of the art.

An alternative method to identify proteins of the
present invention is by constructing gene libraries,
using the resulting clones to transform E. coli and
pooling and screening individual colonies using [...]
produced by chemical synthesis such as solid phase
synthesis, using known amino acid sequences or sequences
derived from the DNA sequence of the genes of interest.
Such methods are known to those skilled in the art.

The proteins of the present invention or their
fragments comprising at least one epitope can be used to
produce antibodies, both polyclonal and monoclonal,
directed to epitopes corresponding to amino acid
sequences disclosed herein. If polyclonal antibodies
are desired, a selected mammal such as [...]
goat or horse is immunized with a protein of the
invention or its fragment, or a mutant protein. Serum
from the immunized animal is collected
according to known procedures. Serum polyclonal
antibodies can be purified by immunoaffinity
chromatography or other known procedures.

Monoclonal antibodies to the proteins of the invention can
also be readily produced by one skilled in the art. The
general methodology for making monoclonal antibodies by
using hybridoma technology is well known. Immortal
antibody-producing cell lines can be created by cell
fusion and also by other techniques such as direct
transformation of B lymphocytes with oncogenic DNA or
transfection with Epstein-Barr virus. See, e.g.,
Schreier et al., "Hybridoma Techniques" (1980);
Hammerling et al., "Monoclonal Antibodies and T-cell
Hybridomas" (1981); Kennett et al., "Monoclonal
Antibodies" [...]; U.S. Patent Nos. [...] 4,466,917;
4,472,500; 4,491,632; and 4,493,890. Panels of
monoclonal antibodies produced against the protein of
interest, or fragment thereof, can be screened for
various properties, i.e., for isotype, epitope,
affinity, etc. Monoclonal antibodies are useful in
purification, using immunoaffinity techniques, of the
individual antigens which they are directed against.
Alternatively, genes encoding the monoclonals of
interest may be isolated from the hybridomas by PCR
[...] expressed in
the appropriate vectors. The antibodies of this
invention, whether polyclonal or monoclonal, have
additional utility in that they may be employed as
reagents in immunoassays, RIA, ELISA, and the like. The
antibodies of the invention can be labeled with an
analytically detectable reagent such as a radioisotope,
fluorescent molecule or enzyme.

Chimeric antibodies, in which non-human variable
regions are joined or fused to human constant regions
(see, e.g., Liu et al., Proc. Natl. Acad. Sci. USA, 84,
3439 (1987)), may also be used in assays or
therapeutically. Preferably, a therapeutic monoclonal
antibody would be "humanized" as described in, e.g.,
Kabat et al., J. Immunol., 147, 1709 (1991); Queen et
al., Proc. Natl. Acad. Sci. USA, 86, 10029 (1989);
Gorman et al., Proc. Natl. Acad. Sci. USA, 88, 34181
(1991); and Hodgson et al., Bio/Technology, 9, 421
(1991).
Another aspect of the present invention is
modulators of the polypeptides of the invention.
Functional modulation of GrbIR-1 by a substance includes
partial to complete inhibition of function, identical
function, as well as enhancement of function.
Embodiments of modulators of the invention include
peptides, oligonucleotides and small organic molecules
including peptidomimetics.

Another aspect of the invention is antisense
oligonucleotides comprising a sequence which is capable
of binding to the polynucleotides of the invention.
Synthetic oligonucleotides or related antisense chemical
structural analogs can be designed to recognize,
specifically bind to and prevent transcription of a
target nucleic acid encoding GrbIR-1 protein by those of
ordinary skill in the art. See generally, Cohen, J.S.,
Trends in Pharm. Sci., 10, 435 (1989) and Weintraub,
H.M., Scientific American, January (1990) at page 40.

Another aspect of the invention is a method for
assaying a medium for the presence of a substance that
modulates GrbIR-1 protein function by affecting the
binding of GrbIR-1 protein to cellular binding partners.
Examples of modulators include, but are not limited to,
peptides and small organic molecules including
peptidomimetics. A GrbIR-1 protein is provided having
the amino acid sequence of human GrbIR-1 (SEQ ID NO:2)
or a functional derivative thereof together with a
cellular binding partner or synthetic analog thereof.
The mixture is incubated with a test substance which is
suspected of modulating GrbIR-1 activity, under
conditions which permit the formation of a GrbIR-1 gene
product/cellular binding partner complex. An assay is
performed for the presence of the complex, free GrbIR-1
protein or free cellular binding partner and the result
compared to a control to determine the effect of the
test substance.

Another aspect of the invention is a method for
assaying for the presence of a substance that modulates
GrbIR-1 activity by direct binding to GrbIR-1 protein.
Examples of modulators include, but are not limited to,
peptides and small organic molecules including
peptidomimetics. Modulator candidates are synthesized
on a solid support by techniques such as those disclosed
in Lam et al., Nature 354, 82 (1991) or Burbaum et al.,
Proc. Natl. Acad. Sci. USA 92, 6027 (1995) to provide
solid support-associated modulator candidates. A
labelled GrbIR-1 protein is provided having the amino
acid sequence of human GrbIR-1 (SEQ ID NO:2) or a
functional derivative thereof. Exemplary labels include
directly attached fluorescent or colored dyes, biotin,
radioisotopes or epitope tags, which are detectable by a
suitable antibody. A mixture of solid support-
associated modulator candidates and labelled GrbIR-1
protein is incubated under conditions which can permit
the formation of a GrbIR-1 protein/modulator candidate
complex. The solid support is separated from free
soluble labelled GrbIR-1 protein. An assay is performed
for the presence of solid support-associated labelled
protein. Solid supports complexed with labelled protein
are isolated and the identity of the modulator candidate
determined by techniques well known to those skilled in
the art.

Modulation of GrbIR-1 function would be expected to
be useful for treatment of diabetes. Inhibition of
GrbIR-1 could be effected through antagonism of the SH2
domain/phosphorylated IR interaction or through
inhibition of the binding of the PH domain to
phosphatidylinositol 4,5-bisphosphate.

Further, GrbIR-1 could be used to isolate proteins
which interact with it and this interaction could be a
target for interference. Inhibitors of protein-protein
interactions between GrbIR-1 and other factors could
lead to the development of pharmaceutical agents for the
modulation of GrbIR-1 activity.

Methods to assay for protein-protein interactions,
such as that of a GrbIR-1 gene product/binding partner
complex, and to isolate proteins interacting with
GrbIR-1 are known to those skilled in the art. Use of
the methods discussed below enables one of ordinary
skill in the art to accomplish these aims without undue
experimentation.

The yeast two-hybrid system provides methods for
detecting the interaction between a first test protein
and a second test protein, in vivo, using reconstitution
of the activity of a transcriptional activator. The
method is disclosed in U.S. Patent No. 5,283,173;
reagents are available from Clontech and Stratagene.
Briefly, GrbIR-1 cDNA is fused to a Gal4 transcription
factor DNA binding domain and expressed in yeast cells.
cDNA library members obtained from cells of interest are
fused to a transactivation domain of Gal4. cDNA clones
which express proteins which can interact with GrbIR-1
will lead to reconstitution of Gal4 activity and
transactivation of expression of a reporter gene such as
Gal1-lacZ. Optionally, the host cells can be
co-transfected with a protein tyrosine kinase to induce
tyrosine phosphorylation of members of the cDNA library.
Such phosphorylation is necessary for optimum
interaction with the SH2 domain of GrbIR-1.

An alternative method is screening of λgt11, λZAP
(Stratagene) or equivalent cDNA expression libraries
with recombinant GrbIR-1. Recombinant GrbIR-1 protein
or fragments thereof are fused to small peptide tags
such as FLAG, HSV or GST. The peptide tags can possess
convenient phosphorylation sites for a kinase such as
heart muscle creatine kinase or they can be
biotinylated. Recombinant GrbIR-1 can be phosphorylated
with [32P] or used unlabeled and detected with
streptavidin or antibodies against the tags. λgt11 cDNA
expression libraries are made from cells of interest and
are incubated with the recombinant GrbIR-1, washed and
cDNA clones isolated which interact with GrbIR-1. See,
e.g., T. Maniatis et al., supra.

Another method is the screening of a mammalian
expression library in which the cDNAs are cloned into a
vector between a mammalian promoter and polyadenylation
site and transiently transfected in COS or 293 cells
followed by detection of the binding protein 48 hours
later by incubation of fixed and washed cells with a
labelled GrbIR-1, preferably iodinated, and detection
of bound GrbIR-1 by autoradiography. See Sims et al.,
Science 241, 585-589 (1988) and McMahan et al., EMBO J.
10, 2821-2832 (1991). A pool of cDNAs containing the
cDNA encoding the binding protein of interest is
selected and the cDNA of interest is further enriched by
subdivision of each pool followed by cycles of transient
transfection, binding and autoradiography.
Alternatively, the cDNA of interest can be isolated by
transfecting the entire cDNA library into mammalian
cells and panning the cells on a dish containing GrbIR-1
bound to the plate. Cells which attach after washing
are lysed and the plasmid DNA isolated, amplified in
bacteria, and the cycle of transfection and panning
repeated until a single cDNA clone is obtained. See
Seed et al., Proc. Natl. Acad. Sci. USA 84, 3365 (1987).
If the binding protein is secreted, its cDNA can be
obtained by a similar pooling strategy once a binding or
neutralizing assay has been established for assaying
supernatants from transiently transfected cells.
General methods for screening supernatants are disclosed
in Wong et al., Science 228, 810-815 (1985).

Another alternative method is isolation of proteins
interacting with GrbIR-1 directly from cells. Fusion
proteins of GrbIR-1 with GST or small peptide tags are
made and immobilized on beads. Biosynthetically labeled
or unlabeled protein extracts from the cells of interest
are prepared, incubated with the beads and washed with
buffer. Proteins interacting with GrbIR-1 are eluted.
Binding partner primary amino acid sequence data are
obtained by microsequencing. Optionally, the cells can
be treated with agents that induce a functional response
such as tyrosine phosphorylation of cellular proteins.
An example of such an agent would be a growth factor or
cytokine such as interleukin-2.

Another alternative method is immunoaffinity
purification. Recombinant GrbIR-1 is incubated with
labeled or unlabeled cell extracts and
immunoprecipitated with anti-GrbIR-1 antibodies. The
immunoprecipitate is recovered with protein A-Sepharose
and analyzed by SDS-PAGE. Unlabeled proteins are
labeled by biotinylation and detected on SDS gels with
streptavidin. Binding partner proteins are analyzed by
microsequencing. Further, standard biochemical
purification steps known to those skilled in the art may
be used prior to microsequencing.

Yet another alternative method is screening of
peptide libraries for binding partners. Recombinant
tagged or labeled GrbIR-1 is used to select peptides
from a peptide or phosphopeptide library which interact
with GrbIR-1. Sequencing of the peptides leads to
identification of consensus peptide sequences which
might be found in interacting proteins.

GrbIR-1 binding partners identified by any of these
methods or other methods which would be known to those
of ordinary skill in the art as well as those putative
binding partners discussed above can be used in the
assay method of the invention. Assaying for the
presence of GrbIR-1/binding partner complex is
accomplished by, for example, the yeast two-hybrid
system, ELISA or immunoassays using antibodies specific
for the complex. In the presence of test substances
which interrupt or inhibit formation of GrbIR-1/binding
partner interaction, a decreased amount of complex will
be determined relative to a control lacking the test
substance.

Assays for free GrbIR-1 or binding partner are
accomplished by, for example, ELISA or immunoassay using
specific antibodies or by incubation of radiolabeled
GrbIR-1 with cells or cell membranes followed by
centrifugation or filter separation steps. In the
presence of test substances which interrupt or inhibit
formation of GrbIR-1/binding partner interaction, an
increased amount of free GrbIR-1 or free binding partner
will be determined relative to a control lacking the
test substance.
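
The comparison against a control lacking the test substance can be sketched in a few lines of Python. This is only an illustration: the function name, the arbitrary signal units and the 20% threshold are assumptions made here, not part of the disclosed assay.

```python
def classify_effect(complex_test, complex_control, threshold=0.20):
    """Classify a test substance from the complex signal it yields.

    complex_test    -- complex signal measured with the test substance
    complex_control -- complex signal in the control lacking the substance
    threshold       -- hypothetical fractional change treated as meaningful
    """
    if complex_control <= 0:
        raise ValueError("control signal must be positive")
    # Fractional change of complex signal relative to the control.
    change = (complex_test - complex_control) / complex_control
    if change < -threshold:
        return "inhibitor"   # less complex than control
    if change > threshold:
        return "enhancer"    # more complex than control
    return "no effect"
```

For an assay that measures free GrbIR-1 or free binding partner instead of the complex, the direction reverses: an inhibitor of complex formation raises the free signal.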

Another aspect of the invention is pharmaceutical
compositions comprising an effective amount of a GrbIR-1
modulator of the invention and a pharmaceutically
acceptable carrier. Pharmaceutical compositions of
modulators of this invention for parenteral
administration, i.e., subcutaneously, intramuscularly or
intravenously, or oral administration can be prepared.
The compositions for parenteral administration will
commonly comprise a solution of the modulators of the
invention or a cocktail thereof dissolved in an
acceptable carrier, preferably an aqueous carrier. A
variety of aqueous carriers may be employed. These
solutions are sterile and generally free of particulate
matter. These solutions may be sterilized by
conventional, well-known sterilization techniques.
The compositions may contain pharmaceutically acceptable
auxiliary substances as required to approximate
physiological conditions such as pH adjusting and
buffering agents, etc. The concentration of the
modulator of the invention in such pharmaceutical
formulation can vary widely, i.e., from less than about
0.5%, usually at or at least about 1% to as much as 15
or 20% by weight and will be selected primarily based on
fluid volumes, viscosities, etc. according to the
particular mode of administration selected.

Thus, a pharmaceutical composition of the modulator
of the invention for intramuscular injection could be
prepared to contain 1 mL sterile buffered water and a
protein of the invention. Similarly, a pharmaceutical
composition of the modulator of the invention for
intravenous infusion could be made up to contain 250 mL
of sterile Ringer's solution and 50 mg of a modulator of
the invention. Actual methods for preparing
parenterally administrable compositions are well known
or will be apparent to those skilled in the art and are
described in more detail in, for example, Remington's
Pharmaceutical Science, 15th ed., Mack Publishing
Company, Easton, Pennsylvania.
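
As a worked illustration of the intravenous example (50 mg of modulator in 250 mL of Ringer's solution), the resulting concentration can be computed as below. The helper names are hypothetical and the snippet is arithmetic only; it is not a formulation guide.

```python
def concentration_mg_per_ml(mass_mg, volume_ml):
    """Mass concentration of a solute in mg/mL."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return mass_mg / volume_ml


def percent_w_v(mg_per_ml):
    """Convert mg/mL to percent weight/volume (g per 100 mL)."""
    # 1 mg/mL = 0.1 g per 100 mL, so divide by 10.
    return mg_per_ml / 10.0


# Values taken from the intravenous infusion example in the text.
iv_conc = concentration_mg_per_ml(50, 250)   # 0.2 mg/mL
iv_percent = percent_w_v(iv_conc)            # about 0.02 % w/v
```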
The physician will determine the dosage of the
present therapeutic agents which will be most suitable
and it will vary with the form of administration and the
particular compound chosen, and furthermore, it will
vary with the particular patient under treatment.
Generally, the physician will wish to initiate treatment
with small dosages substantially less than the optimum
dose of the compound and increase the dosage by small
increments until the optimum effect under the
circumstances is reached. It will generally be found
that when the composition is administered orally, larger
quantities of the active agent will be required to
produce the same effect as a smaller quantity given
parenterally. The therapeutic dosage will generally be
from 1 to 10 milligrams per day and higher although it
may be administered in several different dosage units.

Depending on the patient condition, the
pharmaceutical composition of the invention can be
administered for prophylactic and/or therapeutic
treatments. In therapeutic application, compositions
are administered to a patient already suffering from a
disease in an amount sufficient to cure or at least
partially arrest the disease and its complications. In
prophylactic applications, compositions containing the
present compounds or a cocktail thereof are administered
to a patient not already in a disease state to enhance
the patient's resistance to the disease.

Single or multiple administrations of the
pharmaceutical compositions can be carried out with dose
levels and pattern being selected by the treating
physician. In any event, the pharmaceutical composition
of the invention should provide a quantity of the
modulators of the invention sufficient to effectively
treat the patient.

Additionally, some diseases result from inherited
defective genes. These genes can be detected by
comparing the sequence of the defective gene with that
of a normal one. Individuals carrying mutations in the
GrbIR-1 gene may be detected at the DNA level by a
variety of techniques. Nucleic acids used for diagnosis
(genomic DNA, mRNA, etc.) may be obtained from a
patient's cells, such as from blood, urine, saliva or
tissue biopsy, e.g., chorionic villi sampling or removal
of amniotic fluid cells and autopsy material. The
genomic DNA may be used directly for detection or may be
amplified enzymatically by using PCR, ligase chain
reaction (LCR), strand displacement amplification (SDA),
etc. prior to analysis. See, e.g., Saiki et al.,
Nature, 324, 163-166 (1986); Bej et al., Crit. Rev.
Biochem. Molec. Biol., 26, 301-334 (1991); Birkenmeyer
et al., J. Virol. Meth., 35, 117-126 (1991); Van Brunt,
J., Bio/Technology, 8, 291-294 (1990). RNA or cDNA may
also be used for the same purpose. As an example, PCR
primers complementary to the nucleic acid of the instant
invention can be used to identify and analyze GrbIR-1
mutations. For example, deletions and insertions can be
detected by a change in size of the amplified product in
comparison to the normal GrbIR-1 genotype. Point
mutations can be identified by hybridizing amplified DNA
to radiolabeled GrbIR-1 RNA of the invention or,
alternatively, radiolabeled GrbIR-1 antisense DNA
sequences of the invention. Perfectly matched sequences
can be distinguished from mismatched duplexes by RNase A
digestion or by differences in melting temperatures
(Tm). Such a diagnostic would be particularly useful
for prenatal and even neonatal testing.
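
Detection of deletions and insertions by a change in amplified product size reduces to a comparison of two lengths. The sketch below is illustrative only: the function name and the sizing tolerance are assumptions, and the 288 bp reference in the usage lines is the A1/P1 fragment size reported in Example 1.

```python
def classify_amplicon(observed_bp, reference_bp, tolerance_bp=0):
    """Compare an observed PCR product size with the reference size.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    diff = observed_bp - reference_bp
    if abs(diff) <= tolerance_bp:
        return "no size change"
    if diff > 0:
        return f"insertion of about {diff} bp"
    return f"deletion of about {-diff} bp"


# Hypothetical observed sizes compared with a 288 bp reference product.
normal = classify_amplicon(288, 288)
shorter = classify_amplicon(268, 288)
```

A size comparison of this kind cannot reveal point mutations, which is why the text turns to hybridization, RNase A digestion and melting temperature for those.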

In addition, point mutations and other sequence
differences between the reference gene and "mutant"
genes can be identified by yet other well-known
techniques, e.g., direct DNA sequencing and single-strand
conformational polymorphism. See Orita et al.,
Genomics, 5, 874-879 (1989). For example, a sequencing
primer is used with double-stranded PCR product or a
single-stranded template molecule generated by a
modified PCR. The sequence determination is performed
by conventional procedures with radiolabeled nucleotides
or by automatic sequencing procedures with fluorescent
tags. Cloned DNA segments may also be used as probes to
detect specific DNA segments. The sensitivity of this
method is greatly enhanced when combined with PCR. The
presence of nucleotide repeats may correlate to a
causative change in GrbIR-1 activity or serve as a marker
for various polymorphisms.

Genetic testing based on DNA sequence differences
may be achieved by detection of alteration in
electrophoretic mobility of DNA fragments in gels with
or without denaturing agents. Small sequence deletions
and insertions can be visualized by high resolution gel
electrophoresis. DNA fragments of different sequences
may be distinguished on denaturing formamide gradient
gels in which the mobilities of different DNA fragments
are retarded in the gel at different positions according
to their specific melting or partial melting
temperatures. See, e.g., Myers et al., Science, 230,
1242 (1985). In addition, sequence alterations, in
particular small deletions, may be detected as changes
in the migration pattern of DNA heteroduplexes in
non-denaturing gel electrophoresis such as heteroduplex
electrophoresis. See, e.g., Nagamine et al., Am. J.
Hum. Genet., 45, 337-339 (1989). Sequence changes at
specific locations may also be revealed by nuclease
protection assays, such as RNase and S1 protection or
the chemical cleavage method as disclosed by Cotton et
al. in Proc. Natl. Acad. Sci. USA, 85, 4397-4401 (1985).

Thus, the detection of a specific DNA sequence may
be achieved by methods such as hybridization (e.g.,
heteroduplex electrophoresis, see White et al.,
Genomics, 12, 301-306 (1992)), RNase protection (e.g.,
Myers et al., Science, 230, 1242 (1985)), chemical
cleavage (e.g., Cotton et al., Proc. Natl. Acad. Sci.
USA, 85, 4397-4401 (1985)), direct DNA sequencing, or
the use of restriction enzymes (e.g., restriction
fragment length polymorphisms (RFLP) in which variations
in the number and size of restriction fragments can
indicate insertions, deletions, presence of nucleotide
repeats and any other mutation which creates or destroys
an endonuclease restriction sequence). Southern blotting
of genomic DNA may also be used to identify large (i.e.,
greater than 100 base pair) deletions and insertions.
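
The way a point mutation that destroys a restriction site changes the fragment pattern can be shown with a toy in silico digest. Everything in this sketch is hypothetical: the sequences are synthetic, the function name is illustrative, and the model assumes a linear molecule cut by EcoRI (recognition site GAATTC, cut after the first G).

```python
def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear `seq` at every `site`.

    cut_offset is the position of the cut within the recognition site.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Synthetic 46 nt "reference allele" carrying a single EcoRI site.
reference = "ATGC" * 5 + "GAATTC" + "CCGG" * 5
# Synthetic "variant allele": one base change destroys the site.
variant = reference.replace("GAATTC", "GAGTTC")

ref_fragments = digest(reference, "GAATTC", 1)   # two fragments
var_fragments = digest(variant, "GAATTC", 1)     # one uncut fragment
```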

In addition to conventional gel electrophoresis and
DNA sequencing, mutations such as microdeletions,
aneuploidies, translocations and inversions can also be
detected by in situ analysis. See, e.g., Keller et al.,
DNA Probes, 2nd Ed., Stockton Press, New York, N.Y., USA
(1993). That is, DNA or RNA sequences in cells can be
analyzed for mutations without isolation and/or
immobilization onto a membrane. Fluorescence in situ
hybridization (FISH) is presently the most commonly
applied method and numerous reviews of FISH have
appeared. See, e.g., Tkachuk et al., Science, 250,
559-562 (1990), and Trask et al., Trends Genet., 7,
149-154 (1991). Hence, by using nucleic acids based on
the structure of the GrbIR-1 genes, one can develop
diagnostic tests for genetic mutations.

In addition, some diseases are a result of, or are
characterized by, changes in gene expression which can
be detected by changes in the mRNA. Alternatively, the
GrbIR-1 gene can be used as a reference to identify
individuals expressing an increased or decreased level
of GrbIR-1 protein, e.g., by Northern blotting or in
situ hybridization.

Defining appropriate hybridization conditions is
within the skill of the art. See, e.g., "Current
Protocols in Mol. Biol." Vol. I & II, Wiley
Interscience, Ausubel et al. (eds.) (1992). Probing
technology is well known in the art and it is
appreciated that the size of the probes can vary widely
but it is preferred that the probe be at least 15
nucleotides in length. It is also appreciated that such
probes can be and are preferably labeled with an
analytically detectable reagent to facilitate
identification of the probe. Useful reagents include
but are not limited to radioisotopes, fluorescent dyes
or enzymes capable of catalyzing the formation of a
detectable product. As a general rule, the more
stringent the hybridization conditions, the more closely
related the recovered genes will be.

The putative role of GrbIR-1 in signal
transduction of the insulin receptor pathway establishes
yet another aspect of the invention which is gene
therapy. "Gene therapy" means gene supplementation
where an additional reference copy of a gene of interest
is inserted into a patient's cells. As a result, the
protein encoded by the reference gene corrects the
defect and permits the cells to function normally, thus
alleviating disease symptoms. The reference copy would
be a wild-type form of the GrbIR-1 gene or a gene
encoding a protein or peptide which modulates the
activity of the endogenous GrbIR-1.

Gene therapy of the present invention can occur in
vivo or ex vivo. Ex vivo gene therapy requires the
isolation and purification of patient cells, the
introduction of a therapeutic gene and introduction of
the genetically altered cells back into the patient. A
replication-deficient virus such as a modified
retrovirus can be used to introduce the therapeutic
GrbIR-1 gene into such cells. For example, mouse
Moloney leukemia virus (MMLV) is a well-known vector in
clinical gene therapy trials. See, e.g., Boris-Lawrie
et al., Curr. Opin. Genet. Dev., 3, 102-109 (1993).
In contrast, in vivo gene therapy does not require
isolation and purification of a patient's cells. The
therapeutic gene is typically "packaged" for
administration to a patient such as in liposomes or in a
replication-deficient virus such as adenovirus as
described by Berkner, K.L., in Curr. Top. Microbiol.
Immunol., 158, 39-66 (1992) or adeno-associated virus
(AAV) vectors as described by Muzyczka, N., in Curr.
Top. Microbiol. Immunol., 158, 97-129 (1992) and U.S.
Patent No. 5,252,479. Another approach is
administration of "naked DNA" in which the therapeutic
gene is directly injected into the bloodstream or muscle
tissue. Another approach is administration of "naked
DNA" in which the therapeutic gene is introduced into
the target tissue by microparticle bombardment using
gold particles coated with the DNA.

Cell types useful for gene therapy of the present
invention include lymphocytes, hepatocytes, myoblasts,
fibroblasts, any cell of the eye such as retinal cells,
epithelial and endothelial cells. Preferably the cells
are T lymphocytes drawn from the patient to be treated,
hepatocytes, any cell of the eye or respiratory or
pulmonary epithelial cells. Transfection of pulmonary
epithelial cells can occur via inhalation of a
nebulized preparation of DNA vectors in liposomes,
DNA-protein complexes or replication-deficient
adenoviruses. See, e.g., U.S. Patent No. 5,240,846.

Another aspect of the invention is transgenic,
non-human mammals capable of expressing the
polynucleotides of the invention in any cell.
Transgenic, non-human animals may be obtained by
transfecting appropriate fertilized eggs or embryos of a
host with the polynucleotides of the invention or with
mutant forms found in human diseases. See, e.g., U.S.
Patent Nos. 4,736,866; 5,175,385; 5,175,384 and
5,175,386. The resultant transgenic animal may be used
as a model for the study of GrbIR-1 gene function or for
producing large amounts of GrbIR-1 protein for screening
or crystallography purposes. Particularly useful
transgenic animals are those which display a detectable
phenotype associated with the expression of the GrbIR-1
protein. Drug development candidates may then be
screened for their ability to reverse or exacerbate the
relevant phenotype.

The present invention will now be described with
reference to the following specific, non-limiting
examples.



Example 1 

GrbIR-1 full-length cDNA Cloning and Sequence Analysis

A search of a random cDNA sequence database
consisting of short partial sequences known as expressed
sequence tags (ESTs) with SH2 domain encoding sequences
using the BLASTX algorithm disclosed an EST which was
homologous to a murine epidermal growth factor receptor-
binding protein grbl cDNA sequence reported by Margolis,
B.L. et al. in Proc. Natl. Acad. Sci. USA 89, 8894-8898
(1992) (SEQ ID NO: 3). The EST was originally isolated
from a human cerebellum cDNA library.

A 5'-rapid amplification of cDNA ends (5' RACE)
protocol was used to isolate the 5' cDNA end of the
putative human gene. Candidate 5' RACE products were
amplified by PCR from a λgt11 human skeletal muscle
library (Clontech cat. no. HL1124b). The PCR contained
100 ng of phage DNA, a lambda-specific primer
5'GATTGGTGGCGACGACTCC3' (SEQ ID NO: 4) and a gene-
specific primer 5'CCCGTGAAACCAGTGCTGTG3' (SEQ ID NO: 5).
Thirty cycles were conducted as follows: 94°C for 20 s,
70°C to 55°C in 0.5°C increments/cycle for 30 s and 72°C
for 2 min. A PCR product of 1.7 kb was purified and
subcloned into pBluescript II and sequenced. Sequence
analysis revealed the fragment to be the 5' end of the
gene, containing the remaining coding sequence,
including the N-terminus.
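
The stepped annealing temperature used above (70°C falling toward 55°C by 0.5°C per cycle) can be tabulated with a short Python sketch. The function name is hypothetical, and the sketch assumes that the first cycle anneals at the starting temperature and that the temperature is held once the floor is reached; the text does not state how the thermal cycler was actually programmed. Under that reading the last of thirty cycles anneals at 55.5°C, half a degree above the stated endpoint.

```python
def touchdown_schedule(start_c, floor_c, step_c, cycles):
    """Annealing temperature for each cycle of a touchdown PCR.

    The temperature drops by step_c per cycle from start_c and is
    held at floor_c once that value is reached (an assumption).
    """
    return [max(floor_c, start_c - step_c * i) for i in range(cycles)]


# Parameters from the 5' RACE amplification described above.
race = touchdown_schedule(70.0, 55.0, 0.5, 30)
first_anneal = race[0]    # annealing temperature of cycle 1
last_anneal = race[-1]    # annealing temperature of cycle 30
```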

A cDNA encoding an intact coding sequence was
assembled. A 3.4 kb PCR product was amplified from the
EST using the primers T7 5'GTAATACGACTCACTATAGGGC3' (SEQ
ID NO: 6) and 5'GGTAGCCAAAGTCCCCTCCA3' (SEQ ID NO: 7),
and a 1.7 kb PCR product was amplified from the 5' RACE
fragment isolated above using the primers
5'GATTGGTGGCGACGACTCC3' (SEQ ID NO: 8) and
5'TGGAGGGGACTTTGGCTACC3' (SEQ ID NO: 9). The PCR
conditions were 94°C for 15 s, 55°C for 20 s, 72°C for 4
min., for 25 cycles. These products were combined by
PCR in a second reaction containing each of the above
PCR products and the primers
5'GGAATTCCATGAATGCATCCCTGGAGAG3' (SEQ ID NO: 10) and
5'CCCTCGAGTCATAAGGCCACTCGGATGC3' (SEQ ID NO: 11). The
PCR conditions were 94°C for 15 s, 45°C for 20 s, 72°C
for 2 min., for 25 cycles. The 1.6 kb secondary PCR
product was treated with EcoRI and XhoI and subcloned
into pGEX4T-3 (Pharmacia). The protein is expressed in
E. coli strain LE392 at moderate levels upon IPTG
induction and is soluble.
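
Two features of the primers listed above can be checked directly from their sequences: the internal primers (SEQ ID NO: 7 and SEQ ID NO: 9) are reverse complements of each other, which is what allows the two first-round products to be joined in the second PCR, and the outer primers carry the recognition sites of the enzymes used for subcloning (GAATTC for EcoRI, CTCGAG for XhoI). The sketch below is a minimal check; the variable and function names are illustrative only.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


# Primer sequences as given in the text (5' to 3').
seq_id_7 = "GGTAGCCAAAGTCCCCTCCA"
seq_id_9 = "TGGAGGGGACTTTGGCTACC"
seq_id_10 = "GGAATTCCATGAATGCATCCCTGGAGAG"
seq_id_11 = "CCCTCGAGTCATAAGGCCACTCGGATGC"

# The internal primers anneal to the same 20 nt on opposite strands.
internal_primers_overlap = reverse_complement(seq_id_7) == seq_id_9

# The outer primers carry the cloning sites near their 5' ends.
has_ecori_site = "GAATTC" in seq_id_10
has_xhoi_site = "CTCGAG" in seq_id_11
```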

Independent confirmation of the existence of a mRNA
corresponding to the full-length cDNA produced was
carried out by RT-PCR. cDNA was prepared from 100 ng of
human skeletal muscle polyA RNA (Clontech cat. no.
6541-1) using random hexamer primers and MoMLV reverse
transcriptase. One twentieth of the cDNA was used as
template in a PCR reaction containing the following
primer sets: A1/P1, A2/P1, A2/P2, and A2/7-2 (A1:
5'GTGAGCTGACCCTGCTGGAG3' (SEQ ID NO: 12); A2:
5'AGACCTAAGCCTGTTTGCTCC3' (SEQ ID NO: 13); P1:
5'ACCGTGTCTGACTGCATGCT3' (SEQ ID NO: 14); P2:
5'TGAAGTTCCCTTGGTGGAGC3' (SEQ ID NO: 15); 7-2:
5'CCCGTGAAACCAGTGCTGTG3' (SEQ ID NO: 16)). The expected
288 bp, 203 bp, 954 bp and 1461 bp PCR fragments were
observed, respectively. The PCR conditions were 94°C for
15 s, 70°C to 50°C in 0.5°C increment/cycle for 20 s,
72°C for 2 min., for 40 cycles. Control reactions
containing either no template or the 1.6 kb recombined
cDNA produced above gave either no PCR product or the
expected fragments.
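
The expected sizes of the A1/P1 and A2/P1 products can be reproduced from the first 360 nucleotides of SEQ ID NO:1 as given in the Sequence Listing. The sketch below is illustrative: the function names are hypothetical, and only the two primer pairs that lie wholly within those 360 nucleotides can be checked this way (P2 and 7-2 anneal further downstream).

```python
# First 360 nt of SEQ ID NO:1, transcribed from the Sequence Listing.
SEQ1_PREFIX = (
    "CGGCGCAACTTTGGCTCCCAGGGAACAAACATCCTCCTTCTAAGTGGTAGATGTGGGTGA"
    "GCTGACCCTGCTGGAGTCTGTCCCCTGGGCTACCCTCTGCTTCCCCCCATTGTGAGTGGT"
    "CCGTGAAGCACAGCGTTGACCAGACCTAAGCCTGTTTGCTCCCAGGACAAGGTGGAGCAG"
    "ACACCTCGCAGTCAACAAGACCCGGCAGGACCAGGACTCCCCGCACAGTCTGACCGACTT"
    "GCGAATCACCAGGAGGATGATGTGGACCTGGAAGCCCTGGTGAACGATATGAATGCATCC"
    "CTGGAGAGCCTGTACTCGGCCTGCAGCATGCAGTCAGACACGGTGCCCCTCCTGCAGAAT"
)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def amplicon_length(template, forward, reverse):
    """Length of the product primed by `forward` and `reverse`.

    Returns None if either primer has no exact match on the template.
    """
    start = template.find(forward)
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site)
    if start == -1 or end == -1:
        return None
    return end + len(reverse_site) - start


A1 = "GTGAGCTGACCCTGCTGGAG"     # SEQ ID NO: 12
A2 = "AGACCTAAGCCTGTTTGCTCC"    # SEQ ID NO: 13
P1 = "ACCGTGTCTGACTGCATGCT"     # SEQ ID NO: 14

a1_p1 = amplicon_length(SEQ1_PREFIX, A1, P1)
a2_p1 = amplicon_length(SEQ1_PREFIX, A2, P1)
```

Both values agree with the 288 bp and 203 bp fragments reported in the text.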

Sequence analysis of the full-length cDNA revealed
a 1608 nucleotide open reading frame (SEQ ID NO: 1)
encoding a 536 amino acid protein (SEQ ID NO: 2) with a
predicted molecular mass of 59 kDa, starting with an ATG
at position 289 and terminating with a TGA at position
1897 of SEQ ID NO: 1.
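
The figures in this paragraph are mutually consistent, as the short check below illustrates. The average residue mass of 110 Da is a common rule of thumb introduced here as an assumption; it is not the method used to obtain the 59 kDa prediction in the text.

```python
ORF_NT = 1608          # open reading frame length, from the text
START_POS = 289        # position of the initiating ATG in SEQ ID NO:1
AVG_RESIDUE_DA = 110   # assumed average mass per amino acid residue

codons = ORF_NT // 3                               # amino acids encoded
stop_pos = START_POS + ORF_NT                      # first base after the ORF
approx_mass_kda = codons * AVG_RESIDUE_DA / 1000   # rough molecular mass
```

With these inputs the reading frame encodes 536 residues, the stop codon begins at position 1897, and the rough mass estimate is close to 59 kDa.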

GenBank searches using the BLASTX and BLASTP
algorithms with the full-length cDNA sequence or with
the deduced amino acid sequence were carried out to
identify homologous entries. The search results
indicated that the isolated full-length cDNA is an
alternatively spliced isoform of Grb-IR (Liu et al.,
supra, GenBank Accession U34355 (SEQ ID NO: 17 and SEQ
ID NO: 18)) designated as GrbIR-1, and is a member of
the Grb10/Grb7 family of SH2 adapter proteins. See Fig.
1 for a multiple sequence alignment of GrbIR-1, Grb-IR,
murine Grb10 and human Grb7.

An alignment of Grb-IR and GrbIR-1 using the GAP
algorithm is shown in Fig. 2 (top, GrbIR-1; bottom,
Grb-IR). The overall amino acid identity was 99.6% with
one gap. GrbIR-1 contains an insert which restores an
incomplete pleckstrin homology (PH) domain in Grb-IR and
GrbIR-1 contains a shortened N-terminus when compared
with Grb-IR. The regions other than the C-terminal SH2
domain and the PH domain did not show significant
homologies to other database entries.
Example 2
Tissue Distribution of GrbIR-1

Northern blots of tissue mRNA were conducted to
determine the tissue distribution of GrbIR-1 gene
transcription. The cDNA insert was amplified by PCR
using the primers T3 and T7 and the 3.5 kb product was
purified. Twenty-five ng of the PCR product was
radiolabeled with [32P]-dATP using random hexamer
primers and used to probe human multiple tissue Northern
blots (Clontech cat. nos. 7760-1 and 7759-1). The
membranes were washed at high stringency and exposed for
6 hrs to a storage phosphor screen (Molecular Dynamics)
for visualization. Expression of the corresponding mRNA
was largely ubiquitous and variable in level in heart,
brain, placenta, lung, liver, skeletal muscle, kidney,
pancreas, spleen, thymus, prostate, testes, ovaries,
small intestine and colon, although absent from
peripheral blood leukocytes. The mRNA is approximately
5.6 kb in length. Highest expression was observed in
heart, brain, skeletal muscle, and pancreas. Two
additional transcripts are observed in skeletal muscle,
of 4.8 and 3.1 kb. These may correspond to additional
protein isoforms in this tissue.

The present invention may be embodied in other 
specific forms without departing from the spirit or 
essential attributes thereof, and, accordingly, 
reference should be made to the appended claims, rather 
than to the foregoing specification, as indicating the 
scope of the invention. 
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: SmithKline Beecham Corporation and Harvard University 

(ii) TITLE OF THE INVENTION: GROWTH FACTOR RECEPTOR-BINDING INSULIN
RECEPTOR

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SmithKline Beecham Corporation 

(B) STREET: 709 Swedeland Road 

(C) CITY: King of Prussia 

(D) STATE: PA 

(E) COUNTRY: USA 
(F) ZIP: 19406

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ Version 1.5 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 09-JULY-1996 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 

(B) FILING DATE: 
(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Baumeister, Kirk

(B) REGISTRATION NUMBER: 33,833

(C) REFERENCE/DOCKET NUMBER: P50508P

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 610-270-5096 

(B) TELEFAX: 

(C) TELEX: 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2505 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CGGCGCAACT TTGGCTCCCA GGGAACAAAC ATCCTCCTTC TAAGTGGTAG ATGTGGGTGA 60

GCTGACCCTG CTGGAGTCTG TCCCCTGGGC TACCCTCTGC TTCCCCCCAT TGTGAGTGGT 120

CCGTGAAGCA CAGCGTTGAC CAGACCTAAG CCTGTTTGCT CCCAGGACAA GGTGGAGCAG 180

ACACCTCGCA GTCAACAAGA CCCGGCAGGA CCAGGACTCC CCGCACAGTC TGACCGACTT 240

GCGAATCACC AGGAGGATGA TGTGGACCTG GAAGCCCTGG TGAACGATAT GAATGCATCC 300

CTGGAGAGCC TGTACTCGGC CTGCAGCATG CAGTCAGACA CGGTGCCCCT CCTGCAGAAT 360

GGCCAGCATG CCCGCAGCCA GCCTCGGGCT TCAGGCCCTC CTCGGTCCAT CCAGCCACAG 420

GTGTCCCCGA GGCAGAGGGT GCAGCGCTCC CAGCCTGTGC ACATCCTCGC TGTCAGGCGC 480

CTTCAGGAGG AAGACCAGCA GTTTAGAACC TCATCTCTGC CGGCCATCCC CAATCCTTTT 540

CCTGAACTCT GTGGCCCTGG GAGCCCCCCT GTGCTCACGC CGGGTTCTTT ACCTCCGAGC 600

CAGGCCGCCG CAAAGCAGGA TGTTAAAGTC TTTAGTGAAG ATGGGACAAG CAAAGTGGTG 660

GAGATTCTAG CAGACATGAC AGCCAGAGAC CTGTGCCAAT TGCTGGTTTA CAAAAGTCAC 720

TGTGTGGATG ACAACAGCTG GACACTAGTG GAGCACCACC CGCACCTAGG ATTAGAGAGG 780

TGCTTGGAAG ACCATGAGCT GGTGGTCCAG GTGGAGAGTA CCATGGCCAG TGAGAGTAAA 840

TTTCTATTCA GGAAGAATTA CGCAAAATAC GAGTTCTTTA AAAATCCCAT GAATTTCTTC 900
CCAGAACAGA


TGGTTACTTG 


GTGCCAGCAG 


TV* & & & inppr ft 


pmra ft aprri 


uL 111 luLnu 




AATTT1 LTGA 


ft f TV^/^Rfrpft/** 

AL I V-LAG TACj 


x xoxLL ioAA 


ft. TTP A A P.PJ'ST 


mrp« j ■.. 1 1/^ R TYIT 
ill luwniui 




1020 


r*r* ft A ft ✓■"» n ft ft rr\ 
GGAAAGAAA I 


ft TV"*^ ft ft ft ft ft 

t. ATGG AAAAA 




TP TTTfi HClfZ A 


r: a tv^Tft/VPT 






ACCAAGGGAA 


/"'/T'.'TV** ft ft R Cr* Ti 

CTTCAAAGGA 


m A /** ft ft P 

ACCCAGACAL 




ttpppp a Of** m 
1 o*jv-HjAL.C I 


fzf* a nci a r* Arte 




•AACATCTTCT 


CCCTGATCGC 


nrv*/"'*/* ft O/** ft ft o 

TGGCAGGAAG 


GAG i ALAALVj 






X ^ w U 


TGC Al AAAGC 


or a ft ft ft ft/™"p 
U AAAL. AAAU I 


P ft OP ft ft TV"* ft ft 






^ i v7 i n\j 


1260 


/"* ft O/"' ft /^/^ ft R ft 

GACGAGCAAA 




C I uoa 1 vj AL. A 




TPPTP A Af^T A 


Tfi A A ATflt^TC 


1320 


f *1 ■ U I'll I't JV R /"* ft 

CTTT ACC AG A 


A mm ft /"» ft ft 

ATTACCGAAT 


ppo'pp a ftf' 




fry* TrrPP 


riT^pcTCC APH 


1380 


CCAGTGCGCA 


GTG xt-Tut-oA 


GAAt, I (.UL. IV- 




nl 1 1 1 I V_ i vjvj 




1440 


CGCGTGATAG 


AGAATCCGGC 


GGAGGCCCAG 




T^r* a rr* a f^rr* 

l VjVjAVjVjA*j (jVj 


r*p A P^PPTYtr^ 
CV.AV.W7V.V. 100 


1 Sf)0 


AGGAAGCGAA 


GCACACGGAT 


GAACATCCTA 


GGTAGCCAAA 


GTCCCCTCCA 


CCCTTC1 AUL 


1D0U 


CTAAGTACAG 


TGATTCACAG 


GACACAGCAC 


TGGTTTCACG 


GGAGGTTCTL 


CAGGGAGoAA 




TCCCACAGGA 


TCATTAAACA 


GCAAGGGCTC 


GTGGATOGGC 


TTTTTCTCCT 


LLU 1 uA\,nu\, 


1 fiftO 

iOOU 


CAGAGTAATC 


CAAAGGCATT 


TGTACTCACA 


CTGTGTCATC 


Krr* ft o a a a a T 1 
ACv_AvjAAAA I 


m a a a a iTwr 
1AAAAA1 1 




CAGATCTTAC 


CTTGCGAGGA 


CGACGGGCAG 


ACGTTCTTCA 


/"•/** /^'P ft O ft rn/^« n 

GCCTAGATGA 


f'r^T'C ft ft O ft P , /~ > 


1 flnn 

luuU 


AAATTCTCTG 


ACCTGATCCA 


GCTGGTTGAC 


TTTTACCAGC 


m/> x »v o ft ft ftO/*» 

TGAACAAAGG 


ft PTppTfrr 1 ^ 


IOOU 


TGCAAACTCA 


AGCACCACTG 


CATCCGAGTG 


GCCTTATGAC 


GGC AG A 1 U i L 


PTrTPP^PTYI 
L. 1 L. i Lwvv 1 u 


X> ZJ £t U 


AAGACTGGAG 


GAAGTGAACA 


CTGGAGTGAA 


GAAGCGGTCT 


GTGCGT1 GG I 


pahpanpipa 
O AAuAAL AV- A 


1 J 0 u 


CATCGATTCT 


GCACCTGGGG 


ACCCAGAGCG 


» ft m/™< « i x i irp 

AGATGGGTTT 




U 1 Av. CfvrV 


2040 


GATTGACTAG 


TTTGTTGGAC 


TTAAACGACG 


ATT IXjL. 1 vjL i 






2100 


friz's fy^\fr\^^fT\^\ 

TCCCTCTGCG 


TCGGNCAAAT 


TGGGGAGGGL 


a a a n & to 


f A f"2 r*Cf2 A A A f2 


TTGAAAATAA 


2160 


ACTGGAATGA TCATCTTGGC TTGGGCCGCT TAGGAACAAG AACCGGAGAG AAGTGATTGG 2220

AAATGAACTC TTGCCCTGGA ATAATCTTGA CAATTAAAAC TGATATGTTT ACTTTTTTTG 2280

TATTGATCAC TTTTTTGGAC TCCTTCTTTG TTTTCAATAT TGTATTCAGC CTATTGTAGG 2340

AGGGGGATGT GGCGTTTCAA CTCATATAAT ACAGAAAGAG TTTTGGAATG GGCAGATTTC 2400

AAACTGAATA TGGGTCCCCA AATGTTCCCA GAGGGTCCTC CACAACCTCT GNCGACTACC 2460

ACGGTGTNGG ATTCAGCTCC CAAATGACAA ACCCAGNCCT TCCCA 2505



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: N-terminal

(vi) ORIGINAL SOURCE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Asn Ala Ser Leu Glu Ser Leu Tyr Ser Ala Cys Ser Met Gln Ser
1 5 10 15

Asp Thr Val Pro Leu Leu Gln Asn Gly Gln His Ala Arg Ser Gln Pro
20 25 30

Arg Ala Ser Gly Pro Pro Arg Ser Ile Gln Pro Gln Val Ser Pro Arg
35 40 45

Gln Arg Val Gln Arg Ser Gln Pro Val His Ile Leu Ala Val Arg Arg
50 55 60

Leu Gln Glu Glu Asp Gln Gln Phe Arg Thr Ser Ser Leu Pro Ala Ile
65 70 75 80

Pro Asn Pro Phe Pro Glu Leu Cys Gly Pro Gly Ser Pro Pro Val Leu
85 90 95

Thr Pro Gly Ser Leu Pro Pro Ser Gln Ala Ala Ala Lys Gln Asp Val
100 105 110

Lys Val Phe Ser Glu Asp Gly Thr Ser Lys Val Val Glu Ile Leu Ala
115 120 125

Asp Met Thr Ala Arg Asp Leu Cys Gln Leu Leu Val Tyr Lys Ser His
130 135 140

Cys Val Asp Asp Asn Ser Trp Thr Leu Val Glu His His Pro His Leu
145 150 155 160

Gly Leu Glu Arg Cys Leu Glu Asp His Glu Leu Val Val Gln Val Glu
165 170 175

Ser Thr Met Ala Ser Glu Ser Lys Phe Leu Phe Arg Lys Asn Tyr Ala
180 185 190

Lys Tyr Glu Phe Phe Lys Asn Pro Met Asn Phe Phe Pro Glu Gln Met
195 200 205

Val Thr Trp Cys Gln Gln Ser Asn Gly Ser Gln Thr Gln Leu Leu Gln
210 215 220

Asn Phe Leu Asn Ser Ser Ser Cys Pro Glu Ile Gln Gly Phe Leu His
225 230 235 240

Val Lys Glu Leu Gly Lys Lys Ser Trp Lys Lys Leu Tyr Val Cys Leu
245 250 255

Arg Arg Ser Gly Leu Tyr Cys Ser Thr Lys Gly Thr Ser Lys Glu Pro
260 265 270

Arg His Leu Gln Leu Leu Ala Asp Leu Glu Asp Ser Asn Ile Phe Ser
275 280 285

Leu Ile Ala Gly Arg Lys Gln Tyr Asn Ala Pro Thr Asp His Gly Leu
290 295 300

Cys Ile Lys Pro Asn Lys Val Arg Asn Glu Thr Lys Glu Leu Arg Leu
305 310 315 320

Leu Cys Ala Glu Asp Glu Gln Thr Arg Thr Cys Trp Met Thr Ala Phe
325 330 335

Arg Leu Leu Lys Tyr Glu Met Leu Leu Tyr Gln Asn Tyr Arg Ile Pro
340 345 350

Gln Gln Arg Lys Ala Leu Leu Ser Pro Phe Ser Thr Pro Val Arg Ser
355 360 365

Val Ser Glu Asn Ser Leu Val Ala Met Asp Phe Ser Gly Gln Thr Gly
370 375 380

Arg Val Ile Glu Asn Pro Ala Glu Ala Gln Ser Ala Ala Leu Glu Glu
385 390 395 400

Gly His Ala Trp Arg Lys Arg Ser Thr Arg Met Asn Ile Leu Gly Ser
405 410 415

Gln Ser Pro Leu His Pro Ser Thr Leu Ser Thr Val Ile His Arg Thr
420 425 430

Gln His Trp Phe His Gly Arg Phe Ser Arg Glu Glu Ser His Arg Ile
435 440 445

Ile Lys Gln Gln Gly Leu Val Asp Gly Leu Phe Leu Leu Arg Asp Ser
450 455 460

Gln Ser Asn Pro Lys Ala Phe Val Leu Thr Leu Cys His His Gln Lys
465 470 475 480

Ile Lys Asn Phe Gln Ile Leu Pro Cys Glu Asp Asp Gly Gln Thr Phe
485 490 495

Phe Ser Leu Asp Asp Gly Asn Thr Lys Phe Ser Asp Leu Ile Gln Leu
500 505 510

Val Asp Phe Tyr Gln Leu Asn Lys Gly Val Leu Pro Cys Lys Leu Lys
515 520 525

His His Cys Ile Arg Val Ala Leu
530 535
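Because the sequence listing uses three-letter residue codes while the alignment of Fig. 1 uses one-letter codes, a small conversion helper can make the two easier to compare. The sketch below is illustrative only: the names `THREE_TO_ONE` and `to_one_letter` are hypothetical, position counters are simply skipped, and any unrecognised token (for example an OCR misreading) raises an error rather than being guessed.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_text: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    letters = []
    for token in three_letter_text.split():
        if token.isdigit():
            continue  # skip residue position counters such as "5" or "10"
        key = token.capitalize()
        if key not in THREE_TO_ONE:
            raise ValueError(f"unrecognised residue code: {token!r}")
        letters.append(THREE_TO_ONE[key])
    return "".join(letters)


# First sixteen residues of SEQ ID NO: 2, with their position counters.
first_row = "Met Asn Ala Ser Leu Glu Ser Leu Tyr Ser Ala Cys Ser Met Gln Ser 1 5 10 15"
one_letter = to_one_letter(first_row)
```

For this row the helper yields `MNASLESLYSACSMQS`, which corresponds to the start of the GrbIR-1 row in the Fig. 1 alignment.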



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2420 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

AATAATTCTC AAATTTTTCT TACTTACCTA AATATAAGCT AATTTCTATA ACTCTAATTC 60

CTCAAAAGGT ACTCCCTCTC TCTCTCTCTC TCTCTCCCTC TCTCCTAGCA CCTGCTGCTC 120 

AGTAGGAAGG GCAAGAGCAA TTCGAGGCCG GTGCATTGTG AGGAGTCTCC ACCCCTCCTC 180 

CTGCGCTTCC TTCTCCAGGG AGCCTCTCAG GCCGCCCTCA CCTGCCCGAG ATAATTTTAG 240 

TTTCCCTGGG CCTGGAATCT GGATACGCAG GGCCTCGCTC TATATTCTCC CGCCTCAACA 300 

TTCCAAAGGC GGGATAGCCT TTCTACCATC TGTAGAGAAG AGAGAAAGGA TTCGAAATCA 360 

AATCCAAGTG TCTGGGATCT CTAGACAGAG CCAGACTTTG GGCCGGGTGT CCGGCTCCTT 420 

CTGTTGGAGG TGCTCCAGGT GCCATGGAAC TGGATCTGAG CCCGACTCAT CTCAGCAGCT 480 

CCCCAGAAGA TGTGTGCCCA ACTCCTGCTA CCCCTCCTGA GACTCCTCCG CCCCCTGATA 540 

ACCCTCCGCC AGGGGATGTG AAGCGGTCGC AGCCTTTGCC CATCCCCAGC AGCAGGAAAC 600 

TTCGAGAAGA GGAGTTTCAG GCAACCTCTC TGCCCTCCAT CCCCAACCCC TTCCCTGAGC 660 

TCTGCAGCCC ACCTTCACAG AAACCCATTC TTGGTGGTTC CTCCGGTGCA AGGGGGTTGC 720 

TTCCTCGAGA CTCCAGCCGC CTCTGTGTGG TGAAGGTGTA CAGTGAGGAT GGGGCCTGCC 780 

GGTCTGTGGA GGTGGCAGCG GGCGCCACAG CTCGTCACGT GTGTGAGATG CTGGTACAAC 840 

GAGCTCACGC CCTGAGCGAC GAGAGCTGGG GACTAGTGGA ATCCCACCCC TACCTGGCAC 900 

TGGAGCGGGG TCTGGAGGAC CATGAATTTG TGGTGGAAGT GCAGGAGGCC TGGCCTGTGG 960 

GTGGAGATAG CCGCTTCATC TTCCGTAAAA ACTTCGCCAA GTATGAACTA TTCAAGAGCC 1020 

CCCCACACAC CCTGTTTCCA GAAAAGATGG TCTCGAGCTG TCTGGATGCA CAAACAGGCA 1080

TATCCCATGA AGACCTCATC CAGAACTTCC TGAACGCTGG CAGCTTCCCT GAGATCCAGG 1140 

GCTTCCTGCA GCTGCGGGGA TCAGGCCGGG GGTCAGGTCG AAAGCTTTGG AAACGTTTCT 1200 

TCTGCTTTCT GCGTCGATCT GGCCTCTACT ACTCTACCAA GGGTACCTCC AAGGACCCCA 1260 

GACACCTACA GTATGTGGCA GATGTGAATG AGTCCAATGT CTATGTGGTG ACCCAGGGCC 1320 

GCAAGCTGTA TGGGATGCCC ACTGACTTCG GCTTCTGTGT CAAGCCCAAC AAGCTTCGAA 1380 

ACGGCCACAA GGGGCTCCAC ATCTTCTGCA GTGAGGATGA GCAGAGTCGG ACCTGCTGGC 1440

TGGCTGCCTT CCGGCTCTTC AAGTACGGGG TACAGCTATA TAAGAATTAT CAGCAGGCCC 1500 

AGTCTCGTCA CCTGCGCCTA TCCTATTTGG GGTCTCCACC CTTGAGGAGC GTCTCAGACA 1560 

ATACCCTAGT GGCTATGGAC TTCTCTGGCC ATGCGGGGCG TGTCATTGAT AACCCCCGGG 1620 

AAGCTCTGAG TGCCGCCATG GAGGAGGCCC AGGCCTGGAG GAAGAAGACA AACCACCGTC 1680 

TGAGCCTGCC CACCACATGC TCTGGCTCGA GCCTCAGCGC AGCCATTCAT CGCACCCAGC 1740 

CCTGGTTTCA TGGACGCATC TCTCGGGAGG AGAGCCAGCG GCTAATTGGA CAGCAGGGCC 1800 

TGGTGGATGG TGTGTTCCTG GTCCGGGAGA GCCAGAGGAA CCCACAGGGC TTTGTCCTGT 1860 

CCTTGTGCCA TCTGCAGAAA GTCAAGCATT ATCTCATTTT GCCAAGTGAA GATGAAGGTT 1920 

GCCTTTACTT CAGCATGGAT GAGGGCCAGA CCCGTTTCAC AGACCTGCTG CAGCTGGTAG 1980 

AATTCCACCA GCTGAACCGA GGCATCCTGC CCTGCCTGCT GCGCCACTGC TGTGCCCGTG 2040 

TGGCCCTCTG AGGCCGCACA AGCTACTGCA GCCATGGGTT TGCCTACCAC CCTTCTGTCC 2100

TGTGGACTCG GTGCAGGTGG GTGGGGTGGT AAACAGTGGA AGAGCTCCCC CCCCCAATTT 2160 

TATCCCATTT TTTTTAACCT CTCTCAACCA GTGAAACATC CCCTAACCCT GTCCATCCCT 2220 

GACTCCTGTC CCCAAGGGAG GCATTGTGGT CCTGTCCCCT TGGTAGAGCT CCTGAGGTAC 2280 

TGTTCCAGTG AGGGGCATTA TGAGAGGAGC GGGGCAGCCC AGGAGGTCTC ATACCCCACC 2340 
CATAATCTGT ACAGACTGAG AGGCCAGTTG ATCTGCTCTG TTTTATACCA GTAACAATAA 2400
AGATTATTTT TTGATACAAA 2420 
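Each row of a nucleotide listing such as the one above ends with a running count of the last base on that row, which gives a simple consistency check on a transcription. The sketch below is a minimal illustration under stated assumptions: the function name `check_listing_rows` is hypothetical, rows are assumed to be blocks of bases followed by an optional integer counter, and only the letters A, C, G, T and N are treated as valid.

```python
import re


def check_listing_rows(rows):
    """Return (total_bases, problems) for rows like 'ACGTACGTAC ... 60'."""
    total = 0
    problems = []
    for index, row in enumerate(rows, start=1):
        tokens = row.split()
        if not tokens:
            continue  # ignore blank separator lines
        counter = None
        if tokens[-1].isdigit():
            counter = int(tokens[-1])  # trailing running position
            tokens = tokens[:-1]
        bases = "".join(tokens).upper()
        if not re.fullmatch(r"[ACGTN]*", bases):
            problems.append((index, "unexpected character"))
        total += len(bases)
        if counter is not None and counter != total:
            problems.append((index, f"counter {counter} != running total {total}"))
    return total, problems


# First two rows of SEQ ID NO: 3 as listed above.
rows = [
    "AATAATTCTC AAATTTTTCT TACTTACCTA AATATAAGCT AATTTCTATA ACTCTAATTC 60",
    "CTCAAAAGGT ACTCCCTCTC TCTCTCTCTC TCTCTCCCTC TCTCCTAGCA CCTGCTGCTC 120",
]
total, problems = check_listing_rows(rows)
```

A stray character inside a block, or a row whose counter disagrees with the number of bases read so far, is reported with its row index instead of being silently accepted.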

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
GATTGGTGGC GACGACTCC 19 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
CCCGTGAAAC CAGTGCTGTG 20 
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GTAATACGAC TCACTATAGG GC 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGTAGCCAAA GTCCCCTCCA 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GATTGGTGGC GACGACTCC

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TGGAGGGGAC TTTGGCTACC 
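Several of the short oligonucleotides in this listing can be related to one another by strand orientation. As an illustrative check, not a statement from the specification, the sketch below computes a reverse complement and compares SEQ ID NO: 7 with SEQ ID NO: 9 as they are printed here. The helper name `reverse_complement` is hypothetical, and only A, C, G, T and N are handled.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, ignoring spaces between blocks."""
    cleaned = "".join(seq.split())
    return cleaned.translate(_COMPLEMENT)[::-1]


seq_id_7 = "GGTAGCCAAA GTCCCCTCCA"
seq_id_9 = "TGGAGGGGAC TTTGGCTACC"

# True when the two listed oligonucleotides lie on opposite strands
# of the same 20-base stretch.
opposite_strands = reverse_complement(seq_id_7) == "".join(seq_id_9.split())
```

With the sequences as printed, the comparison evaluates to true, which is consistent with the two oligonucleotides being a forward and reverse reading of the same site.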

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
GGAATTCCAT GAATGCATCC CTGGAGAG 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CCCTCGAGTC ATAAGGCCAC TCGGATGC 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GTGAGCTGAC CCTGCTGGAG
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AGACCTAAGC CTGTTTGCTC C 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
ACCGTGTCTG ACTGCATGCT 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
TGAAGTTCCC TTGGTGGAGC 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCCGTGAAAC CAGTGCTGTG

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2070 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:



AAATGTAATT TGAAGAAGGC AGAAGGAACC CATGGCTTTA GCCGGCTGCC CAGATTCCTT 60

TTTGCACCAT CCGTACTACC AGGACAAGGT GGAGCAGACA CCTCGCAGTC AACAAGACCC 120

GGCAGGACCA GGACTCCCCG CACAGTCTGA CCGACTTGCG AATCACCAGG AGGATGATGT 180

GGACCTGGAA GCCCTGGTGA ACGATATGAA TGCATCCCTG GAGAGCCTGT ACTCGGCCTG 240

CAGCATGCAG TCAGACACGG TGCCCCTCCT GCAGAATGGC CAGCATGCCC GCAGCCAGCC 300

TCGGGCTTCA GGCCCTCCTC GGTCCATCCA GCCACAGGTG TCCCCGAGGC AGAGGGTGCA 360

GCGCTCCCAG CCTGTGCACA TCCTCGCTGT CAGGCGCCTT CAGGAGGAAG ACCAGCAGTT 420

TAGAACCTCA TCTCTGCCGG CCATCCCCAA TCCTTTTCCT GAACTCTGTG GCCCTGGGAG 480

CCCCCCTGTG CTCACGCCGG GTTCTTTACC TCCGAGCCAG GCCGCCGCAA AGCAGGATGT 540

TAAAGTCTTT AGTGAAGATG GGACAAGCAA AGTGGTGGAG ATTCTAGCAG ACATGACAGC 600

CAGAGACCTG TGCCAATTGC TGGTTTACAA AAGTCACTGT GTGGATGACA ACAGCTGGAC 660

ACTAGTGGAG CACCACCCGC ACCTAGGATT AGAGAGGTGC TTGGAAGACC ATGAGCTGGT 720

GGTCCAGGTG GAGAGTACCA TGGCCAGTGA GAGTAAATTT CTATTCAGGA AGAATTACGC 780

AAAATACGAG TTCTTTAAAA ATCCCATGAA TTTCTTCCCA GAACAGATGG TTACTTGGTG 840

CCAGCAGTCA AATGGCAGTC AAACCCAGCT TTTGCAGGAA CCCAGACACC TGCAGCTGCT 900

GGCCGACCTG GAGGACAGCA ACATCTTCTC CCTGATCGCT GGCAGGAAGC AGTACAACGC 960

CCCTACAGAC CACGGGCTCT GCATAAAGCC AAACAAAGTC AGGAATGAAA CTAAAGAGCT 1020

GAGGTTGCTC TGTGCAGAGG ACGAGCAAAC CAGGACGTGC TGGATGACAG CGTTCAGACT 1080

CCTCAAGTAT GGAATGCTCC TTTACCAGAA TTACCGAATC CCTCAGCAGA GGAAGGCCTT 1140

GCTGTCCCCG TTCTCGACGC CAGTGCGCAG TGTCTCCGAG AACTCCCTCG TGGCAATGGA 1200

TTTTTCTGGG CAAACAGGAC GCGTGATAGA GAATCCGGCG GAGGCCCAGA GCGCAGCCCT 1260

GGAGGAGGGC CACGCCTGGA GGAAGCGAAG CACACGGATG AACATCCTAG GTAGCCAAAG 1320

TCCCCTCCAC CCTTCTACCC TAAGTACAGT GATTCACAGG ACACAGCACT GGTTTCACGG 1380

GAGGATCTCC AGGGAGGAAT CCCACAGGAT CATTAAACAG CAAGGGCTCG TGGATGGGCT 1440

TTTTCTCCTC CGTGACAGCC AGAGTAATCC AAAGGCATTT GTACTCACAC TGTGTCATCA 1500

CCAGAAAATT AAAAATTTCC AGATCTTACC TTGCGAGGAC GACGGGCAGA CGTTCTTCAG 1560

CCTAGATGAC GGGAACACCA AATTCTCTGA CCTGATCCAG CTGGTTGACT TTTACCAGCT 1620

GAACAAAGGA GTCCTGCCTT GCAAACTCAA GCACCACTGC ATCCGAGTGG CCTTATGACC 1680

GCAGATGTCC TCTCGGCTGA AGACTGGAGG AAGTGAACAC TGGAGTGAAG AAGCGGTCTG 1740

TGCGTTGGTG AAGAACACAC ATCGATTCTG CACCTGGGGA CCCAGAGCGA GATGGGTTTG 1800

TTCGGTGCCA GCCGACCAAG ATTGACTAGT TTGTTGGACT TAAACGACGA TTTGCTGCTG 1860

TGAACCCAGC AGGGTCGCCT CCCTCTGCGT CGGCCAAATT GGGGAGGGCA TGGAAGATCC 1920

AGCGGAAAGT TGAAAATAAA CTGGAATGAT CATCTTGGCT TGGGCCGCTT AGGAACAAGA 1980

ACCGGAGAGA AGTGATTGGA AATGAACTCT TGCCCTGGAA TAATCTTGAC AATTAAAACT 2040

GATATGTTTA AAAAAAAAAA AAAAAAAACT 2070
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 548 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: N-terminal

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Ala Leu Ala Gly Cys Pro Asp Ser Phe Leu His His Pro Tyr Tyr
1 5 10 15

Gln Asp Lys Val Glu Gln Thr Pro Arg Ser Gln Gln Asp Pro Ala Gly
20 25 30

Pro Gly Leu Pro Ala Gln Ser Asp Arg Leu Ala Asn His Gln Glu Asp
35 40 45

Asp Val Asp Leu Glu Ala Leu Val Asn Asp Met Asn Ala Ser Leu Glu
50 55 60

Ser Leu Tyr Ser Ala Cys Ser Met Gln Ser Asp Thr Val Pro Leu Leu
65 70 75 80

Gln Asn Gly Gln His Ala Arg Ser Gln Pro Arg Ala Ser Gly Pro Pro
85 90 95

Arg Ser Ile Gln Pro Gln Val Ser Pro Arg Gln Arg Val Gln Arg Ser
100 105 110

Gln Pro Val His Ile Leu Ala Val Arg Arg Leu Gln Glu Glu Asp Gln
115 120 125

Gln Phe Arg Thr Ser Ser Leu Pro Ala Ile Pro Asn Pro Phe Pro Glu
130 135 140

Leu Cys Gly Pro Gly Ser Pro Pro Val Leu Thr Pro Gly Ser Leu Pro
145 150 155 160

Pro Ser Gln Ala Ala Ala Lys Gln Asp Val Lys Val Phe Ser Glu Asp
165 170 175

Gly Thr Ser Lys Val Val Glu Ile Leu Ala Asp Met Thr Ala Arg Asp
180 185 190

Leu Cys Gln Leu Leu Val Tyr Lys Ser His Cys Val Asp Asp Asn Ser
195 200 205

Trp Thr Leu Val Glu His His Pro His Leu Gly Leu Glu Arg Cys Leu
210 215 220

Glu Asp His Glu Leu Val Val Gln Val Glu Ser Thr Met Ala Ser Glu
225 230 235 240

Ser Lys Phe Leu Phe Arg Lys Asn Tyr Ala Lys Tyr Glu Phe Phe Lys
245 250 255

Asn Pro Met Asn Phe Phe Pro Glu Gln Met Val Thr Trp Cys Gln Gln
260 265 270

Ser Asn Gly Ser Gln Thr Gln Leu Leu Gln Glu Pro Arg His Leu Gln
275 280 285

Leu Leu Ala Asp Leu Glu Asp Ser Asn Ile Phe Ser Leu Ile Ala Gly
290 295 300

Arg Lys Gln Tyr Asn Ala Pro Thr Asp His Gly Leu Cys Ile Lys Pro
305 310 315 320

Asn Lys Val Arg Asn Glu Thr Lys Glu Leu Arg Leu Leu Cys Ala Glu
325 330 335

Asp Glu Gln Thr Arg Thr Cys Trp Met Thr Ala Phe Arg Leu Leu Lys
340 345 350

Tyr Gly Met Leu Leu Tyr Gln Asn Tyr Arg Ile Pro Gln Gln Arg Lys
355 360 365

Ala Leu Leu Ser Pro Phe Ser Thr Pro Val Arg Ser Val Ser Glu Asn
370 375 380

Ser Leu Val Ala Met Asp Phe Ser Gly Gln Thr Gly Arg Val Ile Glu
385 390 395 400

Asn Pro Ala Glu Ala Gln Ser Ala Ala Leu Glu Glu Gly His Ala Trp
405 410 415

Arg Lys Arg Ser Thr Arg Met Asn Ile Leu Gly Ser Gln Ser Pro Leu
420 425 430

His Pro Ser Thr Leu Ser Thr Val Ile His Arg Thr Gln His Trp Phe
435 440 445

His Gly Arg Ile Ser Arg Glu Glu Ser His Arg Ile Ile Lys Gln Gln
450 455 460

Gly Leu Val Asp Gly Leu Phe Leu Leu Arg Asp Ser Gln Ser Asn Pro
465 470 475 480

Lys Ala Phe Val Leu Thr Leu Cys His His Gln Lys Ile Lys Asn Phe
485 490 495

Gln Ile Leu Pro Cys Glu Asp Asp Gly Gln Thr Phe Phe Ser Leu Asp
500 505 510

Asp Gly Asn Thr Lys Phe Ser Asp Leu Ile Gln Leu Val Asp Phe Tyr
515 520 525

Gln Leu Asn Lys Gly Val Leu Pro Cys Lys Leu Lys His His Cys Ile
530 535 540

Arg Val Ala Leu
545
CLAIMS

1. An isolated polynucleotide selected from the
group consisting of:

(a) a polynucleotide encoding human GrbIR-1 having
the nucleotide sequence as set forth in SEQ ID NO:1 from
nucleotide 289 to 1897;

(b) a polynucleotide capable of hybridizing to the
complement of a polynucleotide according to (a) under
moderately stringent hybridization conditions and which
encodes a functional human GrbIR-1; and

(c) a degenerate polynucleotide according to (a)
or (b).

2. An isolated polynucleotide having the
nucleotide sequence as set forth in SEQ ID NO:1.

3. A functional polypeptide encoded by the
polynucleotide of claim 1.

4. The functional polypeptide of claim 3 which is
human GrbIR-1 having the amino acid sequence set forth
in SEQ ID NO: 2.
5. The polynucleotide of claim 1 which is DNA.

6. The polynucleotide of claim 5 which is genomic
DNA.

7. The polynucleotide of claim 1 which is RNA.

8. A vector comprising the DNA of claim 5.

9. A recombinant host cell comprising the vector
of claim 8.

10. A method for preparing essentially pure human 
GrbIR-1 protein comprising culturing the recombinant 
host cell of claim 9 under conditions promoting 
expression of the protein and recovering the expressed 
protein. 

11. Human GrbIR-1 produced by the process of claim 10.

12. An antisense oligonucleotide comprising a 
sequence which is capable of binding to the 
polynucleotide of claim 1. 

13. A modulator of the polypeptide of claim 3.

14. The modulator of claim 13 which is a peptide.

15. The modulator of claim 13 which is a small
organic molecule.

16. The small organic molecule of claim 15 which
is a peptidomimetic.

17. A method for assaying a medium for the
presence of a substance that modulates GrbIR-1 activity
by affecting the binding of GrbIR-1 to cellular binding
partners comprising the steps of:

(a) providing a GrbIR-1 protein having the
amino acid sequence of GrbIR-1 (SEQ ID NO: 2) or a
functional derivative thereof and a cellular binding
partner or synthetic analog thereof;

(b) incubating with a test substance which is
suspected of modulating GrbIR-1 activity under
conditions which permit the formation of a GrbIR-1
protein/cellular binding partner complex;

(c) assaying for the presence of the complex,
free GrbIR-1 protein or free cellular binding partner;
and

(d) comparing to a control to determine the
effect of the substance.

18. GrbIR-1 protein modulating compounds 
identified by the method of claim 17. 

19. A method for assaying for the presence of a
substance that modulates GrbIR-1 activity by direct
binding to GrbIR-1 protein comprising the steps of:

(a) providing a labelled GrbIR-1 protein
having the amino acid sequence of GrbIR-1 (SEQ ID NO: 2)
or a functional derivative thereof;

(b) providing solid-support-associated
modulator candidates;

(c) incubating a mixture of the labelled
GrbIR-1 protein with the support-associated modulator
candidates under conditions which can permit the
formation of a GrbIR-1 protein/modulator candidate
complex;

(d) separating the solid support from free
soluble labelled GrbIR-1 protein;

(e) assaying for the presence of solid
support-associated labelled protein;

(f) isolating the solid support complexed
with labelled GrbIR-1 protein; and

(g) identifying the modulator candidate.

20. GrbIR-1 modulating compounds identified by the
method of claim 19.
21. A method for the treatment of a patient having
need to modulate GrbIR-1 activity comprising
administering to the patient a therapeutically effective
amount of the modulating compound of claims 18 or 20.

22. A pharmaceutical composition comprising the
modulating compound of claims 18 or 20 and a
pharmaceutically acceptable carrier.

23. A method of diagnosing conditions associated 
with GrbIR-1 protein deficiency which comprises: 

(a) isolating a polynucleotide sample from an
individual;

(b) assaying the polynucleotide sample and a
polynucleotide encoding GrbIR-1 having the nucleotide
sequence as set forth in SEQ ID NO:1 from nucleotide 289
to 1897; and

(c) comparing differences between the
polynucleotide sample and the GrbIR-1 polynucleotide,
wherein any differences indicate mutations in the
GrbIR-1 gene.

24. A method of treating conditions which are
related to insufficient GrbIR-1 protein function which
comprises:

(a) isolating cells from a patient deficient in
GrbIR-1 protein function;

(b) altering the cells by transfecting the
polynucleotide of claim 1 into the cells wherein a
GrbIR-1 protein is expressed; and

(c) introducing the cells back to the patient to
alleviate the condition.

25. A method of treating conditions which are
related to insufficient GrbIR-1 protein function which
comprises administering the polynucleotide of claim 1 to
a patient deficient in GrbIR-1 protein function wherein
a GrbIR-1 protein is expressed and alleviates the
condition.

26. A transgenic non-human animal capable of
expressing in any cell thereof the DNA of claim 5.
         1                                                    50
GrbIR-1
Grb-IR   MALAGCPDSF LHHPYYQDKV EQTPRSQQDP AGPGLPAQSD RLANHQEDDV
mGrb10
hGrb7

         51                                                  100
GrbIR-1  .MN ASLESLYSAC SMQS..DTVP LLQNGQHARS QPRASGPPRS
Grb-IR   DLEALVNDMN ASLESLYSAC SMQS..DTVP LLQNGQHARS QPRASGPPRS
mGrb10   MNNDIN SSVESLNSAC NMQSDTDTAP LLEDGQHASN QGAASSSR.
hGrb7    MELDLSPP HLSSSPEDL W PAPGTPPGTP

         101                                                 150
GrbIR-1  IQPQVSPRQR VQRSQPVHI. LAVRRLQEED QQFRTSSLPA IPNPFPELCG
Grb-IR   IQPQVSPRQR VQRSQPVHI. LAVRRLQEED QQFRTSSLPA IPNPFPELCG
mGrb10   GQPQASPRQK MQRSQPVHI. L..RRLQEED QQLRTASLPA IPNPFPELTG
hGrb7    RPPDTPLPEE VKRSQPLLIP TTGRKLREEE R..RATSLPS IPNPFPELCS

         151                                                 200
GrbIR-1  ..PGSPPVLT PGSL..PPSQ AAAKQ
Grb-IR   ..PGSPPVLT PGSL..PPSQ AAAKQ
mGrb10   AAPGSPPSVA PSSLPPPPSQ PPAKHCGRCE KWIPGENTRG NGKRKIWRWQ
hGrb7    PPSQSPILGG PSSARGLLPR DASRPHV

         201                                                 250
GrbIR-1
Grb-IR
mGrb10   FPPGFQLSKL TRPGLWTKTT ARFSKKQPKN QCPTDTVNPV ARMPTSQMEK
hGrb7

         251                                                 300
GrbIR-1  DVKVF SEDGTSKVVE ILADMTARDL CQLLVYKSHC VDDNSWTLVE
Grb-IR   DVKVF SEDGTSKVVE ILADMTARDL CQLLVYKSHC VDDNSWTLVE
mGrb10   LRLRKDVKVF SEDGTSKVVE ILTDMTARDL CQLLVYKSHC VDDNSWTLVE
hGrb7    VKVY SEDGACRSVE VAAGATARHV CEMLVQRAHA LSDETWGLVE

         301                                                 350
GrbIR-1  HHPHLGLERC LEDHELVVQV EST..MASES KFLFRKNYAK YEFFK.NPMN
Grb-IR   HHPHLGLERC LEDHELVVQV EST..MASES KFLFRKNYAK YEFFK.NPMN
mGrb10   HHPQLGLERC LEDHEIVVQV EST..MPSES KFLFRKNYAK YEFFK.NPVN